PROTEOMICS

De novo sequence prediction for: nsi78_11.1803.1806.2.dta

Sequence            Absolute Probability   Relative Probability
CRGSVNFP[PL]FK      3.9%                   36.3%
CRGSVN[DE][PL]FK    2.3%                   24.7%
CRGSVPFN[PN]FK      6.1%                   17.2%
CRGSV[SR]D[PL]FK    3.1%                   6.5%
CRGSVPFNWGDK        <0.1%                  2.7%
Date posted: 21-Dec-2015


The “omics” nomenclature…

Genomics: DNA (gene)
  -> Transcription
Transcriptomics: RNA
  -> Translation
Proteomics: PROTEIN
  -> Enzymatic reaction
Metabolomics: METABOLITE

Functional genomics

A few definitions…

~ome = sequence of a complete set of:
  Genome: genes
  Transcriptome: transcripts
  Proteome: proteins
  Metabolome: metabolites

~omics = analysis of the ~ome:
  Genomics: analysis of the genome
  Proteomics: analysis of the proteome


Current -omics

The proteome is defined as the set of all expressed proteins in a cell, tissue or organism (Wilkins et al., 1997).

Proteomics can be defined as the systematic analysis of proteins for their identity, quantity and function.

Proteome                          Genome
dynamic                           static
No amplification possible         Amplification possible
Heterogeneous molecules           Homogeneous molecules
Large variability of the amount   No variability of the amount


Complexity of the proteome

Applications of Proteomics
• Mining: identification of proteins (catalog the proteins)
• Quantitative proteomics: defining the relative or absolute amount of a protein
• Protein-expression profile: identification of proteins in a particular state of the organism
• Protein-network mapping: protein interactions in living systems
• Mapping of protein modifications: how and where proteins are modified.

Protein classes for analysis

• Membrane

• Soluble proteins

• Organelle-specific

• Chromosome-associated

• Phosphorylated

• Glycosylated

• Multi-protein complexes

General flow for proteomics analysis

SEPARATION

IDENTIFICATION

Debora Frigi Rodrigues
You do your experiment, then extract the protein to obtain a protein mixture, which you separate in two dimensions (usually the first dimension is by protein charge and the second by protein mass). The separation can be done in a gel, as shown here, or by liquid chromatography and/or mass spectrometry.
Current Proteomics Technologies
• Proteome profiling/separation
  - 2D SDS PAGE (two-dimensional sodium dodecyl sulphate polyacrylamide gel electrophoresis)
  - 2-D LC/LC (LC = liquid chromatography)
  - 2-D LC/MS (MS = mass spectrometry)
• Protein identification
  - Peptide mass fingerprint
  - Tandem mass spectrometry (MS/MS)
• Quantitative proteomics
  - ICAT (isotope-coded affinity tag)
  - SILAC (stable isotope labeling by amino acids in cell culture)

The first dimension (separation by isoelectric focusing)
- gel with an immobilised pH gradient
- electric current causes charged proteins to move until they reach their isoelectric point (the position in the pH gradient where their net charge is 0)

2D-PAGE gel


Isoelectric point (pI)

• Separation by charge: stable pH gradient (pH 4 to 10)

High pH (above the pI): protein is negatively charged

Low pH (below the pI): protein is positively charged

At the isoelectric point the protein has no net charge and therefore no longer migrates in the electric field.
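
The relationship between pH and net charge described above can be sketched numerically. The following Python snippet is only an illustration and is not part of the original slides: it estimates the net charge of a peptide with the Henderson-Hasselbalch equation and locates the pI by bisection. The pKa values and the names `net_charge` and `isoelectric_point` are assumptions chosen for this sketch; published pKa sets differ, so the computed pI is an estimate at best.

```python
# Approximate pKa values (ASSUMED, textbook-style; real pKa sets vary by source).
PKA_POSITIVE = {"nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"cterm": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(sequence, ph):
    """Estimate the net charge of a peptide (one-letter codes) at a given pH."""
    positive = ["nterm"] + [aa for aa in sequence if aa in ("K", "R", "H")]
    negative = ["cterm"] + [aa for aa in sequence if aa in ("D", "E", "C", "Y")]
    charge = 0.0
    for group in positive:
        # Fraction of the basic group that is protonated (+1) at this pH.
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
    for group in negative:
        # Fraction of the acidic group that is deprotonated (-1) at this pH.
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH where the estimated net charge crosses zero (bisection)."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle   # still positively charged: the pI lies at higher pH
        else:
            high = middle  # negatively charged: the pI lies at lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    peptide = "ACDEK"  # arbitrary example peptide
    print("estimated pI:", round(isoelectric_point(peptide), 2))
```

Bisection works here because the estimated net charge decreases monotonically as the pH rises, which mirrors the behaviour on the slide: positive at low pH, negative at high pH, zero at the pI.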


The second dimension (separation by mass)
- the pH gel strip is loaded onto an SDS gel
- SDS denatures and linearises the protein (so that movement depends solely on mass, not shape)

2D-SDS PAGE gel



2D-gel technique example


Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J. Mass Spectrom., 36, 1083-1091.

Some limitations of 2DE:

• Limited dynamic range of detection: bias towards highly abundant proteins
• Co-migration of proteins
• Difficult separation of certain proteins
  - Basic proteins (pI > 10)
  - Hydrophobic proteins
  - Small and large proteins (< 10 kDa; > 150 kDa)

Methods for protein identification

Mass Spectrometry (MS) Stages
• Introduce sample to the instrument
• Generate ions in the gas phase
• Separate ions on the basis of differences in m/z with a mass analyzer
• Detect ions

[Diagram components: Samples, HPLC, Ionisation method (MALDI, ESI), Mass analyser, Detector, Data system, Vacuum system]


Aebersold, R. and Mann, M. (2003) Mass spectrometry-based proteomics. Nature, 422, 198-207.

Mass spectrometers used in proteomic research

Principles of MALDI-TOF Mass Spectrometry

Mann, M., Hendrickson, R.C. and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem, 70, 437-473.

Electrospray ionisation (ESI)

M + RH+ -> MH+ + R (in solution)



Protein identification by Peptide Mass fingerprint

• Use MS to measure the masses of proteolytic peptide fragments.

• Identification is done by matching the measured peptide masses to corresponding peptide masses from protein or nucleotide sequence databases.
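
The database side of a peptide mass fingerprint can be sketched in a few lines. The Python snippet below is a hedged illustration, not the method of any particular search engine: it digests a protein sequence in silico and computes a monoisotopic mass for each peptide. It assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages or modifications, and uses the hypothetical function names `tryptic_digest`, `peptide_mass` and `mass_fingerprint`.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        following = sequence[index + 1] if index + 1 < len(sequence) else ""
        if residue in ("K", "R") and following != "P":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def mass_fingerprint(sequence, min_length=1):
    """List (peptide, mass) pairs for every tryptic peptide of the sequence."""
    return [
        (peptide, round(peptide_mass(peptide), 4))
        for peptide in tryptic_digest(sequence)
        if len(peptide) >= min_length
    ]


if __name__ == "__main__":
    for peptide, mass in mass_fingerprint("AKRPGKC", min_length=2):
        print(peptide, mass)
```

Running the same digestion over every entry of a sequence database yields the theoretical peptide masses against which measured masses are compared.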

Mass spectrometry (MS)

Mass spectrometry: a method of separating molecules based on their mass/charge ratio (m/z), e.g. MALDI-TOF

Protein digested into peptides (trypsin)

Compare peptide m/z with protein databases
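
Because the instrument reports m/z rather than mass, the same peptide appears at different positions depending on its charge. The short Python sketch below illustrates the conversion. It is an illustration only and assumes positive-ion mode with charge carried by added protons; the function names are hypothetical, and the isotope spacing uses the approximate 13C/12C mass difference.

```python
PROTON_MASS = 1.00728      # mass of a proton in Da
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def mz(mass, charge):
    """m/z of a molecule with the given neutral mass carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


def neutral_mass(observed_mz, charge):
    """Recover the neutral mass from an observed m/z and a known charge."""
    return observed_mz * charge - charge * PROTON_MASS


def charge_from_isotope_spacing(spacing):
    """Estimate the charge from the m/z gap between adjacent isotope peaks."""
    return round(ISOTOPE_SPACING / spacing)


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(mz(1250.0, z), 3))
```

Isotope peaks spaced about 0.5 m/z apart therefore point to a doubly charged ion, and peaks about 1.0 apart to a singly charged one.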

Protein Identification by MS

Sample: spot removed from gel -> fragmented using trypsin -> spectrum of fragments generated

Library: database of sequences (e.g. SwissProt) -> artificially trypsinated -> artificial spectra built

MATCH: the sample spectrum is compared with the library of artificial spectra
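
The MATCH step can be sketched as a simple counting exercise. The Python snippet below is a deliberately naive illustration, not the scoring scheme of any real search tool: it counts how many observed peptide masses fall within a tolerance of a protein's theoretical masses and ranks the candidates by that count. The library contents, protein names, tolerance and function names are all hypothetical.

```python
def count_matches(observed, theoretical, tolerance=0.2):
    """Count observed masses lying within `tolerance` Da of any theoretical mass."""
    return sum(
        1
        for mass in observed
        if any(abs(mass - candidate) <= tolerance for candidate in theoretical)
    )


def rank_proteins(observed, library, tolerance=0.2):
    """Rank library proteins by the number of matching peptide masses."""
    scores = {
        name: count_matches(observed, masses, tolerance)
        for name, masses in library.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical theoretical peptide masses for two made-up proteins.
    library = {
        "protein_A": [500.3, 842.5, 1045.6],
        "protein_B": [500.3, 910.4, 1300.7],
    }
    observed = [842.4, 1045.7, 500.2]  # made-up measured masses
    for name, score in rank_proteins(observed, library):
        print(name, score)
```

Real search engines weigh matches statistically instead of counting them, which is one reason a raw match count can give the ambiguous results noted for this approach.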

Advantages vs. Disadvantages

Advantages:
• Determination of MW
• High-throughput capability
• Relatively low cost

Disadvantages:
• Ambiguous results that are difficult to interpret
• Requires sequence databases for analysis

Limitations can be overcome by peptide sequencing using tandem mass spectrometry

How does protein sequencing work?

• Use tandem MS: two mass analyzers in series with a collision cell in between
• Collision cell: a region where the ions collide with a gas (He, Ne, Ar), resulting in fragmentation of the ion
• Fragmentation of the peptides in the collision cell occurs in a predictable fashion, mainly at the peptide bonds (also phosphoester bonds)
• The resulting daughter ions have masses that are consistent with known molecular weights of dipeptides, tripeptides, tetrapeptides…

Ser-Glu-Leu-Ile-Arg-Trp -> Collision Cell -> Ser-Glu-Leu-Ile-Arg, Ser-Glu-Leu-Ile, Ser-Glu-Leu, etc…
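
The fragment ladder for the example peptide can be computed directly. The Python sketch below is an illustration under simplifying assumptions: singly charged fragments, cleavage only at peptide bonds, and no modifications. N-terminal fragments such as Ser-Glu-Leu are commonly called b ions and the complementary C-terminal fragments y ions; the function names `b_ions` and `y_ions` are chosen here for illustration.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def b_ions(peptide):
    """Singly charged N-terminal fragment m/z values (b1 .. b(n-1))."""
    masses = []
    running = 0.0
    for residue in peptide[:-1]:
        running += RESIDUE_MASS[residue]
        masses.append(running + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged C-terminal fragment m/z values (y1 .. y(n-1))."""
    masses = []
    running = WATER  # y ions keep the C-terminal OH plus one extra hydrogen
    for residue in reversed(peptide[1:]):
        running += RESIDUE_MASS[residue]
        masses.append(running + PROTON)
    return masses


if __name__ == "__main__":
    peptide = "SELIRW"  # Ser-Glu-Leu-Ile-Arg-Trp from the slide
    for index, value in enumerate(b_ions(peptide), start=1):
        print(f"b{index}", peptide[:index], round(value, 3))
    for index, value in enumerate(y_ions(peptide), start=1):
        print(f"y{index}", peptide[-index:], round(value, 3))
```

The mass difference between consecutive ions in a series identifies the residue at that position, which is how a sequence is read from the spectrum. Leu and Ile have identical masses, so this kind of ladder cannot tell them apart.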


Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J Mass Spectrom, 36, 1083-1091.

Schematic of a quadrupole TOF instrument

After traversing a countercurrent gas stream (curtain gas), the ions enter the vacuum system and are focused into the first quadrupole section (q0). They can be mass-separated in Q1 and dissociated in q2. Ions enter the time-of-flight analyzer through a grid and are pulsed into the reflector and onto the detector, where they are recorded. There are 14,000 pulsing events per second.

Mann, M., Hendrickson, R.C. and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem, 70, 437-473.


Peptide Fragmentation

Tandem Mass Spectrometry

Protein digested into peptides (trypsin)

Individual peptide ions are isolated for a second stage of mass spectrometry, from which the peptide sequence can be obtained

Compare peptide sequence with protein databases

Advantages vs. Disadvantages

Advantages:
• Determination of MW and amino acid sequence
• Detection of posttranslational modifications
• High-throughput capability

Disadvantages:
• High capital costs
• Requires sequence databases for analysis

Coupling of LC and tandem MS

Tryptic digested proteins -> LC (75 µm RP) -> 200 nL to MS -> Ion trap MS

Peptide:
1. MW
2. Sequence
3. Modification

Reverse Phase column

Polypeptides enter the column in the mobile phase…

…the hydrophobic “foot” of each polypeptide adsorbs to the hydrophobic (non-polar) surface of the reverse-phase material (stationary phase), where it remains until…

…the organic modifier concentration rises to the critical concentration and desorbs the polypeptide

Data acquired - Chromatogram

[Figure: two chromatogram traces for RS_Contest_04, plotted against time (0 to 200 min) on a 0 to 100 scale: the total ion chromatogram (TIC, NL: 2.83E9) and the base peak chromatogram (m/z 400.0-2000.0, full MS scan, NL: 4.22E8).]

Triple Play

[Figure: base peak chromatogram (m/z 400.0-2000.0, full MS scan, NL: 1.14E7, RS_Contest_04) for RT 120.99 - 124.07 min, with traces for m/z 626.6, 852.3, 872.5, 865.0, 684.0, 774.5 and 1046.1; x-axis: time (min); y-axis: relative abundance. Labelled scans range from 4504 to 4528.]

Triple Play Dynamic Exclusion

[Figure: three spectra; x-axis: m/z; y-axis: relative abundance.
1. Scan 4501: + c Full ms [400.00-2000.00]; labelled peaks include 626.3, 835.5 and 982.4.
2. Scan 4502: + d Z ms [622.30-632.30]; peaks at 626.1, 626.6, 627.1 and 627.7.
3. Scan 4503: + c d Full ms2 [160.00-1890.00]; labelled peaks include 479.4, 535.8, 828.2 and 957.3.]

Triple Play Dynamic Exclusion

[Figure: three spectra; x-axis: m/z; y-axis: relative abundance.
1. Scan 4504: + c Full ms [400.00-2000.00]; labelled peaks include 626.3, 835.0, 852.2 and 982.6.
2. Scan 4505: + d Z ms [848.00-858.00]; peaks at 852.2 and 853.1.
3. Scan 4506: + c d Full ms2 [220.00-2000.00]; labelled peaks include 471.0, 721.2 and 1261.0.]


2D - LC/MS

Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J Mass Spectrom, 36, 1083-1091.


Multidimensional Protein Identification Technology (MudPIT).

Whitelegge JP (2002) Plant proteomics: BLASTing out of a MudPIT. Proc Natl Acad Sci U S A 99: 11564-6.

Koller A, Washburn MP, Lange BM, Andon NL, Deciu C, Haynes PA, Hays L, Schieltz D, Ulaszek R, Wei J, Wolters D, Yates JR, 3rd (2002) Proteomic survey of metabolic pathways in rice. Proc Natl Acad Sci U S A 5: 5.
