1.proteomics coursework-3 dec2012-aky
amit-yadav -
Course B
WHY BOTHER WITH PROTEOMICS?
• Proteins are the machines that drive much
of biology
• Genes are merely the recipe
• The direct characterization of a sample’s
proteins en masse.
• What proteins are present?
• How much of each protein is present?
WHY NOT MICROARRAYS?
Is Proteomics the New Genomics? Jürgen Cox and Matthias Mann, Cell 130, August 10, 2007
ONE GENOME…MANY PROTEOMES
Perhaps not… they still have a dynamic “proteome” code to break. They cannot hit a moving target.
AN ANALYTICAL CHALLENGE
• Dynamic range of protein abundances is a challenge for separation sciences
• No equivalent of PCR for proteins: deal with µ- to nmol concentrations
• Alternate splice forms of a gene can make different proteins
• >200 post-translational modifications; cannot be deduced from a gene or mRNA
Edman sequencing cannot provide the solutions!
TOOLS FOR PROTEOMICS
• Sequence databases: DNA, ESTs, protein
• Mass spectrometry: ionization techniques, analyzers
• Software: PMF, MS/MS, de novo sequencing
• Protein separation technology: 2D-GE, LC-MS
MASS SPECTROMETRY
The PCR for proteins?
MASS SPECTROMETRY
Analytical method to measure the molecular or atomic
weight of samples
Slide adapted from: Dr. Ahna Skop, Mass Spectrometry: Methods & Theory
SOFT IONIZATION METHODS
MALDI: 337 nm UV laser; matrix: cyano-hydroxy cinnamic acid
ESI: gold tip needle; fluid (no salt)
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
MASS SPECTROMETRY PRINCIPLES
Sample → Ionizer → Mass Analyzer → Detector
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
MASS SPEC EQUATION (TOF)
$$\frac{m}{z} = \frac{2 V t^{2}}{L^{2}}$$
m = mass of ion, z = charge of ion, V = voltage, t = time of travel, L = drift tube length
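The TOF relation above can be tried numerically. The following is a minimal sketch, not taken from the slides: it assumes SI units, so the elementary charge (which the slide equation leaves implicit) is written out, and m/z is given in Da per charge. The function names, the 20 kV voltage and the 1 m tube length are illustrative assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms

def flight_time(mz, voltage, length):
    """Seconds needed by an ion of the given m/z (Da per charge) to cross the drift tube."""
    # Rearranged from m/z = 2*V*t^2 / L^2, with m in kg and charge z*e in coulombs
    return length * math.sqrt(mz * DALTON / (2.0 * E_CHARGE * voltage))

def mz_from_time(t, voltage, length):
    """Inverse relation: m/z (Da per charge) from a measured flight time in seconds."""
    return 2.0 * E_CHARGE * voltage * t ** 2 / (length ** 2 * DALTON)

# Assumed example settings: 20 kV accelerating voltage, 1 m drift tube
t = flight_time(1000.0, 20000.0, 1.0)
print(f"flight time: {t * 1e6:.2f} microseconds")
print(f"recovered m/z: {mz_from_time(t, 20000.0, 1.0):.1f}")
```

Because t scales with the square root of m/z, an ion with four times the m/z takes twice as long to reach the detector.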
MONOISOTOPIC MASS
• Mass of a molecule calculated from the most abundant isotope of each element
• Measured in amu or Da
• For small molecules this is usually the lightest isotopic form
Source: www.matrixscience.com
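As a small illustration of the definition above, the sketch below sums isotope masses over an elemental formula. It is an assumption-laden example rather than part of the original slides: the function name and the dictionary layout are made up for illustration, and the mass tables cover only C, H, N, O and S.

```python
# Mass of the most abundant isotope of each element (12C, 1H, 14N, 16O, 32S), in Da
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}
# Average atomic masses, shown only for comparison
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def formula_mass(formula, table=MONOISOTOPIC):
    """Sum element masses for a formula given as {element: atom count}."""
    return sum(table[element] * count for element, count in formula.items())

glycine = {"C": 2, "H": 5, "N": 1, "O": 2}   # C2H5NO2
print(round(formula_mass(glycine), 4))           # monoisotopic mass
print(round(formula_mass(glycine, AVERAGE), 3))  # average mass, slightly heavier
```

The monoisotopic value is the one to compare against a resolved peak in a spectrum; the average mass is what a low-resolution measurement of the whole isotope envelope would report.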
UNDERSTANDING A SPECTRUM
[Figure: mass spectrum, relative intensity vs. m/z. Peaks at 853.2 and 854.3 are labeled +1; peaks at 1200.5 and 1201.0 are labeled +2.]
(1200.5 × 2) – 2 = 2399.0
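The arithmetic above (multiply m/z by the charge, then subtract the added protons) can be written as a short sketch. The function names are hypothetical. The slide rounds the proton mass to 1; the default below uses 1.007276 Da, and the charge is guessed from the spacing of adjacent isotope peaks (about 1/z).

```python
PROTON = 1.007276  # mass of a proton in Da

def charge_from_spacing(peak_a, peak_b):
    """Estimate charge from two adjacent isotope peaks, which sit about 1/z apart."""
    return round(1.0 / abs(peak_b - peak_a))

def neutral_mass(mz, charge, proton=PROTON):
    """Neutral mass of an ion observed at m/z carrying `charge` protons."""
    return mz * charge - charge * proton

z = charge_from_spacing(1200.5, 1201.0)       # peaks 0.5 apart
print(z)
print(neutral_mass(1200.5, z, proton=1.0))    # same rounding as the slide
print(round(neutral_mass(1200.5, z), 3))      # with the more precise proton mass
```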
MS INSTRUMENTS
A Brief Summary of the Different Types of Mass Spectrometers Used in Proteomics
Methods in Molecular Biology, vol. 484: Functional Proteomics: Methods and Protocols
IDENTIFICATION STRATEGIES
1. Peptide mass fingerprinting (PMF): experimental masses are compared with theoretical masses (database)
2. MS/MS spectral matching: an experimental spectrum is compared with theoretical spectra
3. De novo sequencing*: masses are read directly as residues
   72.0   129.0   97.0   101.0   113.1   174.1
   A      E       P      T       I       R + H2O
4. Spectral library search
*Adapted from: Brian C. Searle, Proteome Software Inc., Portland, Oregon, USA
Nesvizhskii, Journal of Proteomics, 2010
PEPTIDE MASS FINGERPRINTING
A rapid way to identify proteins
PEPTIDE MASS FINGERPRINT
1. The proteins from a sample are separated on 2D gels.
2. The protein of interest is digested with trypsin (or any other site-specific cleavage agent).
3. The peptides are ionized in a MALDI mass spectrometer.
4. The m/z values are detected and plotted as a mass spectrum.
5. A PMF database search identifies the protein.
PROTEASE DIGESTION
trypsin
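An in silico version of this digestion step might look like the sketch below. It assumes the commonly used trypsin rule (cleave after K or R, but not when the next residue is P), which the slide itself does not spell out; the function name and the `missed_cleavages` parameter are illustrative. The test sequence is the start of the Mycobacterium tuberculosis entry shown in this deck.

```python
import re

def trypsin_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides, allowing up to `missed_cleavages` skipped sites."""
    # Zero-width split: after K or R, unless the following residue is P
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides

prefix = "MAASLSENLSCHSSNMCRLSGNAATNLERPGEEPPGDR"
for peptide in trypsin_digest(prefix):
    print(peptide)
```

Note that the R in "LERPG" is not cleaved because a proline follows it, so the second peptide runs on to the next arginine.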
PEPTIDE MASS FINGERPRINT
[Figure: peptide mass fingerprint spectrum, relative intensity vs. m/z]
PMF DATABASE SEARCH
PEAKLIST
450.2201
609.3667
698.3100
1007.5391
1199.4916
2098.9909
>gi|2924450|emb|CAA17750.1| PROBABLE FATTY-ACID-CoA LIGASE FADD18 (FRAGMENT) (FATTY-ACID-CoA SYNTHETASE) (FATTY-ACID-CoA SYNTHASE) [Mycobacterium tuberculosis H37Rv]
MAASLSENLSCHSSNMCRLSGNAATNLERPGEEPPGDRCTRRQAVRPARTLAKKGNIPVGYYKDEKKTAE
TFRTINGVRYAIPGDYAQVEEDGTVTMLGRGSVSINSGGEKVYPEEVEAALKGHPDVFDALVVGVPDPRY
GQQVAAVVQARPGCRPSLAELDSFVRSEIAGYKVPRSLWFVDEVKRSPAGKPDYRWAKEQTEARPADDVH
AGHVTSGS
>gi|15610649|ref|NP_218030.1| fatty-acid-CoA ligase [Mycobacterium tuberculosis H37Rv]
MAASLSENLSCHSSNMCRLSGNAATNLERPGEEPPGDRCTRRQAVRPARTLAKKGNIPVGYYKDEKKTAE
TFRTINGVRYAIPGDYAQVEEDGTVTMLGRGSVSINSGGEKVYPEEVEAALKGHPDVFDALVVGVPDPRY
GQQVAAVVQARPGCRPSLAELDSFVRSEIAGYKVPRSLWFVDEVKRSPAGKPDYRWAKEQTEARPADDVH
AGHVTSGS
Theoretical peptide masses from in silico digestion of the protein FASTA database:
450.2017 (P21234)
609.2667 (P12345)
664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
OUTPUT:
2 Unknown masses
1 hit on P21234
3 hits on P12345
RESULT:
protein is P12345
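The matching step can be sketched in a few lines using the peaklist and theoretical masses from this slide. This is a toy hit-counting scheme, not the scoring used by any real search engine, and the 0.2 Da tolerance is an assumption chosen so that the counts agree with the slide's output.

```python
from collections import Counter

peaklist = [450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909]

# (theoretical peptide mass, protein accession) from the in silico digest
theoretical = [
    (450.2017, "P21234"), (609.2667, "P12345"), (664.3300, "P89212"),
    (1007.4251, "P12345"), (1114.4416, "P89212"), (1183.5266, "P12345"),
    (1300.5116, "P21234"), (1407.6462, "P21234"), (1526.6211, "P89212"),
    (1593.7101, "P89212"), (1740.7501, "P21234"), (2098.8909, "P12345"),
]

def pmf_search(peaks, digest, tol=0.2):
    """Count, per protein, how many observed peaks fall within `tol` Da of a peptide."""
    hits = Counter()
    unknown = []
    for peak in peaks:
        matched = {acc for mass, acc in digest if abs(mass - peak) <= tol}
        if matched:
            hits.update(matched)
        else:
            unknown.append(peak)
    return hits, unknown

hits, unknown = pmf_search(peaklist, theoretical)
print(len(unknown), "unknown masses")
for accession, count in hits.most_common():
    print(count, "hit(s) on", accession)
print("best match:", hits.most_common(1)[0][0])
```

A real search would also weigh how many peptides each protein could have produced, which is part of the significance assessment mentioned at the end of this deck.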
MODIFICATIONS
Fixed modifications: present on every occurrence of the affected amino acid. E.g. +57 @ C (carbamidomethylation)
Variable modifications: may be present on some or all positions of the affected amino acid. E.g. +16 @ M (oxidation)
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
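To make the difference concrete, the sketch below enumerates the modified forms a search has to consider for one peptide: the fixed shift is always applied, while each variable site may or may not carry its shift. The function name and data layout are illustrative assumptions; the mass deltas are the monoisotopic values behind the rounded +57 and +16.

```python
from itertools import combinations

FIXED = {"C": 57.02146}      # carbamidomethylation, applied to every C
VARIABLE = {"M": 15.99491}   # oxidation, optional on each M

def modification_forms(peptide):
    """List (modified variable positions, total mass shift) for every allowed form."""
    fixed_shift = sum(FIXED.get(aa, 0.0) for aa in peptide)
    sites = [i for i, aa in enumerate(peptide) if aa in VARIABLE]
    forms = []
    for k in range(len(sites) + 1):
        for chosen in combinations(sites, k):
            shift = fixed_shift + sum(VARIABLE[peptide[i]] for i in chosen)
            forms.append((chosen, round(shift, 5)))
    return forms

for positions, shift in modification_forms("MAASLSENLSCHSSNMCR"):
    print(positions, shift)
```

Each variable site doubles the number of forms, which is why adding variable modifications enlarges the search space while a fixed modification does not.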
TANDEM MASS SPECTROMETRY
Peptide Sequencing by two stage MS
PRECURSOR SELECTION
[Figure: MS1 spectrum, relative intensity vs. m/z. The unfragmented parent/precursor ion is selected for tandem MS (MS/MS or MS2).]
COLLISION INDUCED DISSOCIATION
CID in presence of inert gas
FRAGMENTATION
Peptide: PEPTIDE
MW    ion   fragment | fragment   ion   MW
98    b1    P        | EPTIDE     y6    703
227   b2    PE       | PTIDE      y5    574
324   b3    PEP      | TIDE       y4    477
425   b4    PEPT     | IDE        y3    376
538   b5    PEPTI    | DE         y2    263
653   b6    PEPTID   | E          y1    148
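The b- and y-ion values in the table can be recomputed with a short sketch. It assumes singly charged ions and monoisotopic residue masses; the function name is illustrative, and the residue table is limited to the amino acids in this example.

```python
# Monoisotopic residue masses (Da) for the amino acids in PEPTIDE
RESIDUE = {"P": 97.05276, "E": 129.04259, "T": 101.04768,
           "I": 113.08406, "D": 115.02694}
PROTON = 1.007276
WATER = 18.010565

def fragment_ions(peptide):
    """Singly charged b and y ion m/z values for each backbone cleavage."""
    masses = [RESIDUE[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b ion: N-terminal residues plus one proton
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ion: C-terminal residues plus water plus one proton
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions

b_ions, y_ions = fragment_ions("PEPTIDE")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} {b:8.3f}    y{len(b_ions) - i + 1} {y:8.3f}")
```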
SHOTGUN PROTEOMICS & DATABASE SEARCH
The pros and cons of peptide-centric proteomics. Mark W. Duncan, Ruedi Aebersold, Richard M. Caprioli
Nature Biotechnology, Vol. 28, No. 7. (01 July 2010), pp. 659-664
DATABASE SEARCH ALGORITHMS
SEQUEST
Mascot
X!Tandem
OMSSA
ProbID
Phenyx
MyriMatch
MassWiz
DE NOVO SEQUENCING
Sequencing a peptide from scratch
DE NOVO INTERPRETATION
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000), not yet annotated]
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
DE NOVO INTERPRETATION
[Figure: the same MS/MS spectrum, with the first peak-to-peak mass differences labeled E and L]
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
DE NOVO INTERPRETATION
[Figure: the same MS/MS spectrum, with peak-to-peak mass differences labeled with residue letters (E, L, F, K, S, G and D appear among the labels)]
Slide adapted from: Nathan Edwards, Center for Bioinformatics and Computational Biology (UMIACS)
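The idea behind these slides, reading a sequence from the mass differences between neighboring peaks of one ion series, can be sketched as follows. This is a deliberately simplified assumption: it expects a clean, complete ladder of peaks from a single series, which real spectra rarely provide, and the function name and the 0.02 Da tolerance are illustrative. The example peaks are the b ions of PEPTIDE.

```python
# Monoisotopic residue masses (Da); I and L have the same mass
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def read_ladder(peaks, tol=0.02):
    """Translate consecutive peak differences into residue letters."""
    ordered = sorted(peaks)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        delta = high - low
        matches = [aa for aa, mass in RESIDUE.items() if abs(mass - delta) <= tol]
        # "?" marks a gap that no single residue explains
        calls.append("/".join(matches) if matches else "?")
    return calls

b_series = [98.0600, 227.1026, 324.1554, 425.2031, 538.2871, 653.3141]
print(read_ladder(b_series))
```

The "I/L" call shows a known limit of the approach: isoleucine and leucine cannot be told apart by mass alone.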
SUMMARY
• Proteomics is the large-scale study (qualitative and quantitative) of proteins by mass spectrometry.
• Mass spectrometry + sequence databases represent a huge leap for protein (bio-)chemistry.
• Protein separation: 2DGE and HPLC
• Ionization: MALDI and ESI
• Identification: PMF, MS/MS and de novo sequencing
• Sample prep, instruments and algorithms are still maturing; much work remains to be done.
NEXT…
Significance Assessment of database matches
False Discovery rate
Protein Inference