Post on 12-Jan-2016
description
Relating Protein Abundance & mRNA Expression
Mark B GersteinYale (Comp. Bio. & Bioinformatics)
NIDA site visit at Yale
2007.12.19
[Greenbaum et al. Bioinformatics 2002, 18, 587.]
Why relate amounts of protein & mRNAGene expression - major place for regulation (easy to measure)
vs. Concentration of protein - major determinant of activity
where ks,i and kd,i are the protein synthesis and degradation rate constants
At steady state: Pi =ks;i [mRNAi ]
kd,i
= ks,i [mRNAi ] – kd,i Pi dPidt
Expectations from simple kinetic models:
Outliers from trend interesting
[Graphic: Jeong et al, Nature, 41:411]
Protein interaction networks
[Graphic: http://proton.chem.yale.edu]
Protein complexes
In protein complexes, one expects stoichiometric abundance of component proteins and that mRNA expression levels should be correlated with protein abundance
…Among pathways, this is expected to a lesser degree between interacting proteins
Relationship of Protein Abundance to Complexes and Pathways
mRNA expression levelsMicroarrays
AffymetrixPCR
SAGE
Protein abundance2D Gel Electrophoresis Multiple staining options
Small dynamic range
DIGE Cy3 vs. Cy5 labelingLarge dynamic range
ICAT, iTRAQ MS-basedRelative abundance (ratio of isotopically
labeled species)Large dynamic range
MudPIT LC-MS/MS
SILAC Stable isotope labeling with amino acids in cell culture - for MS analysis
TAP-Tag Weissman and O’Shea (Oct. 2003)
Sources of experimental data
[http://www.biology.ucsc.edu/mcd/images/microarray.gif]
PARE: proteomics.gersteinlab.org
Upload or use pre-loadedmRNA, protein datasets
Analyze all oranalyze MIPS or GO subset
[Yu et al., BMC Bioinfo. '07]
Open-source codeDownloadable
PARE: a web-based tool for correlating mRNA expression and protein abundance
PAREmain page
(1) Select mRNA, protein datasets: -use pre-loaded datasets -upload datasets
(2) Choose categorization method: -correlate all -MIPS complexes -GO biological processes -GO molecular function -GO cellular component
Select MIPS, GO subsets (opt.)
Display results (3) Display -Linear or log-log correlation for selected subset(s) -Tabulate data, correlation values for selected subset(s) -Label (on plot) and tabulate outlying datapoints
[Yu et al., BMC Bioinfo. '07]
Correlated data
Log-log plot of correlation -linear fit -outliers labeled
Calculation of mutual information
PARE output
[Yu
et a
l., B
MC
Bio
info
. '07
]
Correlation of subsets (GO, MIPS)Yeast ref. datasets: “Correlate all”vs.GO cellular component subsets for particular cellular locations
[Yu
et a
l., B
MC
Bio
info
. '07
]
PARE: pre-loaded datasets
[Yu et al., BMC Bioinfo. '07]
Connecting PARE with datasets from NIDA investigators
Protein abundance (iTRAQ datasets)
Mouse (Nairn lab) - samples from 3 brain regions: cortex, striatum, hippocampus
Green monkey (Taylor lab) - several brain regions -caudate, dlPFC, mPFC, NacC, NacS, PFC11, PFC13, PrCO, putameneach treated with saline, PCP, and cocaine
mRNA datasets obtained from expression database
Mouse - Sandberg et al. PNAS 2000, 97, 11038.
0
0.5
1
1.5
2
2.5
3
0 0.5 1 1.5 2 2.5 3mRNA (hippocampus/cortex)
pro
tein
(h
ipp
oca
mp
us/
cort
ex)
mRNA expression ratio (hippocampus/cortex)
pro
tein
ab
un
dan
ce r
atio
(h
ipp
oca
mp
us/
cort
ex)
Mouse brain: correlation of mRNA & protein expression
For 96 genes with differential expression in hippocampus vs. cortex
Plan to correlate abundance for individual pathways and complexes with C. Bruce ("John", talking later)
Acknowledgements
iTRAQ datasets:Angus Nairn, Erika Andrade, Dilja KruegerJane TaylorChris Colangelo, Mark Shifman (YPED)
PARE:Anne Burba, Eric Yu