Epistasis Analysis Using Microarrays Chris Workman.
Posted 22-Dec-2015
Epistasis Analysis Using Microarrays
Chris Workman
Experiments with Microarrays
Cool technology, but how do we use it? How is it useful?
Identify “marker genes” in disease tissues
Classification, diagnostics
Toxicology, stress response
Drug candidate screens, basic science
Genetic factors
Measuring interactions (ChIP-on-chip)
Overview
Expression profiling in single-deletions
Epistasis analysis using single- and double-deletions
Epistasis analysis, genetic and environmental factors
Reconstructing pathways that explain the genetic relationships between genes
Expression Profiling in 276 Yeast Single-Gene Deletion Strains (“The Rosetta Compendium”)
Only 19% of yeast genes are essential in rich media
Giaever et al. Nature (2002)
Clustered Rosetta Compendium Data
Gene Deletion Profiles Identify Gene Function and Pathways
Principle of Epistasis Analysis
Experimental Design
Compare single-gene deletions to wild type
Compare the double knockout to wild type
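The comparison above can be sketched in a few lines. This is a minimal illustration, not the analysis used in the talk: it assumes each strain is summarized as a list of log expression ratios against wild type, and it calls a gene epistatic when the double-deletion profile correlates clearly better with that gene's single-deletion profile. The function names and the `margin` parameter are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def epistatic_gene(single_a, single_b, double_ab, margin=0.2):
    """Return 'a' or 'b' if the double deletion resembles that single deletion
    by at least `margin` in correlation, otherwise None (no clear call).
    All profiles are log ratios versus wild type (an assumption of this sketch)."""
    r_a = pearson(double_ab, single_a)
    r_b = pearson(double_ab, single_b)
    if r_a - r_b > margin:
        return "a"
    if r_b - r_a > margin:
        return "b"
    return None
```

With toy profiles in which the double deletion tracks the first single deletion, `epistatic_gene` returns `"a"`; when both single-deletion profiles are identical it returns `None`.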
Experimental Design: Single vs. Double-Gene Deletions
Classical Epistasis Analysis Using Microarrays to Determine the Molecular Phenotypes
Time-series expression (0-24 hrs), sampled every 2 hrs
Mixing Genetic and Environmental Factors
Expression in Single-Gene Deletions (yeast mec1 and dun1 deletion strains)
Chen-Hsiang Yeang, PhD
Craig Mak
MIT, UCSD, UC Santa Cruz
Yeang, Jaakkola, Ideker. J Comp Bio (2004)
Yeang, Mak, et al. Genome Res (2005)
Measurements
Networks
“Systems level” understanding
Treat disease
Synthetic biology
In silico cells
Test & Refine
Displaying deletion effects
Published work: “Epistasis analysis using expression profiling” (2005)
Relevant Interactions
Subset of Rosetta compendium used
28 deletions were TFs (red circles)
355 differentially expressed genes (white boxes)
P < 0.005
755 TF-deletion effects (grey squiggles)
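As a rough illustration of how the TF-deletion effects above could be collected, the sketch below keeps every (deleted TF, gene) pair whose p-value falls under the P < 0.005 cutoff from the slide. The input layout (a dictionary keyed by pair) and the function names are assumptions made for this example, not the format of the Rosetta compendium.

```python
P_THRESHOLD = 0.005  # significance cutoff quoted on the slide

def deletion_effects(pvalues, threshold=P_THRESHOLD):
    """Return sorted (deleted_tf, gene) pairs whose p-value is below threshold.
    `pvalues` maps (deleted_tf, gene) -> p-value (hypothetical input layout)."""
    return sorted(pair for pair, p in pvalues.items() if p < threshold)

def affected_genes(effects):
    """Distinct genes that respond to at least one TF deletion."""
    return sorted({gene for _, gene in effects})
```

Each retained pair corresponds to one deletion effect, and the distinct genes among them are the differentially expressed genes.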
Network Measurements
Yeast under normal growth conditions
Promoter binding ChIP-chip / location analysis
Lee et al. Science (2002)
Protein-protein interaction Yeast 2-hybrid
Database of Interacting Proteins (DIP)
Deane et al. Mol Cell Proteomics (2002)
ChIP Measurement of Protein-DNA Interactions (Chromatin Immunoprecipitation)
Step 1: Network connectivity (ChIP-chip analysis)
~ 5k genes (white boxes)
~ 20k interactions (green lines)
Step 2: Network annotation (gene expression analysis)
What parts are wired together
How and why the parts are wired together the way they are
Measure variables that are a function of the network (gene expression).
Monitor these effects after perturbing the network (TF knockouts).
Inferring regulatory paths
Direct OR indirect
Annotate: inducer or repressor
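The direct versus indirect distinction can be illustrated with a small path search. This sketch assumes the physical network is given as a list of directed (source, target) interactions and that a deletion effect counts as "direct" when a single interaction links the deleted TF to the affected gene, and "indirect" when a longer chain does. The function names and the `max_len` limit are hypothetical.

```python
from collections import deque

def shortest_path(edges, source, target, max_len=3):
    """Breadth-first search over directed interactions.
    Returns the shortest node path with at most max_len edges, or None."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 >= max_len:
            continue  # do not extend paths that already use max_len edges
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def classify_effect(edges, tf, gene):
    """Label a TF-deletion effect as 'direct', 'indirect', or 'unexplained'.
    Assumes tf and gene are different nodes."""
    path = shortest_path(edges, tf, gene)
    if path is None:
        return "unexplained"
    return "direct" if len(path) == 2 else "indirect"
```

A deletion effect with no connecting path stays "unexplained" in this sketch, which is one way to flag effects the physical network cannot account for.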
Computational methods
Problem statement: find regulatory paths consisting of physical interactions that “explain” functional relationships
Method: a probabilistic inference approach
Yeang, Ideker et al. J Comp Bio (2004)
To assign annotations:
Formalize problem using a factor graph
Solve using max-product algorithm
Kschischang et al. IEEE Trans. Information Theory (2001)
Mathematically similar to Bayesian inference, Markov random fields, belief propagation
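To make the annotation problem concrete, here is a deliberately simplified stand-in. It does not implement a factor graph or the max-product algorithm; instead it enumerates every assignment of signs (+1 inducer, -1 repressor) to the edges and keeps those that explain the most observed deletion effects, which is only feasible for toy networks. It assumes that deleting the TF at the head of a path changes the target in the direction opposite to the product of the edge signs along that path. All names are hypothetical.

```python
from itertools import product

def predicted_effect(path_edges, signs):
    """Predicted direction of change in the target when the TF at the head
    of the path is deleted: minus the product of edge signs (an assumption)."""
    prod = 1
    for edge in path_edges:
        prod *= signs[edge]
    return -prod

def best_annotations(edges, observations):
    """Exhaustively score sign assignments.
    `observations` is a list of (path_edges, observed_sign) with observed_sign
    +1 (up after deletion) or -1 (down). Returns (best_score, list_of_assignments)."""
    best_score, best = -1, []
    for combo in product((+1, -1), repeat=len(edges)):
        signs = dict(zip(edges, combo))
        score = sum(predicted_effect(p, signs) == obs for p, obs in observations)
        if score > best_score:
            best_score, best = score, [signs]
        elif score == best_score:
            best.append(signs)
    return best_score, best
```

When more than one assignment ties for the best score, the annotation is ambiguous, and further deletion experiments are needed to tell the alternatives apart.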
Inferred Network Annotations
A network with ambiguous annotation
Test & Refine
Which deletion experiments should we do first?
$$I(M; Y_e) = H(M) - H(M \mid Y_e) = H(M) + \sum_{m,y} P(M=m,\, Y_e=y)\, \log_2 P(M=m \mid Y_e=y)$$
A mutual-information-based score
For each candidate experiment (gene e)
Variability of predicted expression profiles
Predict profile for each possible set of annotations
More variable = more information from experiment
Reuse network inference algorithm to compute effect of deletion!
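The scoring idea can be sketched as follows. This is an illustrative simplification: it assumes a small, explicitly enumerated set of candidate models (annotation sets) with known probabilities, and that each model predicts exactly one discretized expression profile for a given deletion, so the only uncertainty about the outcome comes from not knowing the model. The function names and input layout are hypothetical.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(model_probs, predicted):
    """I(M; Y_e) = H(M) - H(M | Y_e) when each model predicts one profile.
    model_probs[k] is P(M = model k); predicted[k] is that model's predicted
    (hashable) profile for experiment e."""
    by_outcome = defaultdict(list)
    for p, y in zip(model_probs, predicted):
        by_outcome[y].append(p)
    h_m_given_y = 0.0
    for ps in by_outcome.values():
        p_y = sum(ps)
        if p_y > 0:
            h_m_given_y += p_y * entropy([p / p_y for p in ps])
    return entropy(model_probs) - h_m_given_y

def rank_experiments(model_probs, predictions_by_gene):
    """Sort candidate deletions by score, most informative first."""
    scores = {gene: mutual_information(model_probs, preds)
              for gene, preds in predictions_by_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A deletion for which all candidate models predict the same profile scores 0 bits, while one for which every model predicts a different profile scores the full entropy of the model distribution.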
Ranking candidate experiments
Gene    Function                                        Score     Downstream genes   Rank   Model
HHF1    Histone                                         52.1429   74                 1      2
SOK2*   regulator for meiosis and PKA pathway           45.0279   64                 2      1
CKA1    protein kinase of cell cycle                    45.0075   64                 3      5
A2      mating response                                 40.9023   58                 4      4
YAP6*   stress response regulator                       35.1652   50                 5      1, 3
NRG1    regulator of glucose dependent genes            31.6501   45                 6      3
FKH1    regulator of cell cycle                         29.1194   41                 7      2
FKH2    regulator of cell cycle                         26.7131   38                 8      7
SLT2    protein kinase of cell wall integrity pathway   23.4727   31                 9      8
MSN4*   regulator of stress response                    21.8224   31                 10     1
HAP4*   regulator of cellular respiration               6.3310    9                  34     1
We target experiments to one region of network
Expression for: SOK2, HAP4, MSN4, YAP6
Expression of Msn4 targets
$$Z_e = \frac{1}{N_{z_{ie} \neq 0}} \sum_{i=1}^{N} \operatorname{sgn}(r_{ie})\, z_{ie}$$
Average signed z-score
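A small sketch of an average signed z-score follows. The slide does not define its symbols, so this example makes two assumptions: that r gives the expected direction of regulation for each target (only its sign is used), and that the sum is normalized by the number of targets with a nonzero z-score. The function names are hypothetical.

```python
def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def average_signed_z(z_scores, r_values):
    """Average of sgn(r_i) * z_i over one experiment's targets.
    Normalizes by the count of nonzero z-scores (an assumption of this sketch);
    returns 0.0 when every z-score is zero."""
    total = sum(sign(r) * z for r, z in zip(r_values, z_scores))
    n_nonzero = sum(1 for z in z_scores if z != 0)
    if n_nonzero == 0:
        return 0.0
    return total / n_nonzero
```

Targets whose expression moves in the expected direction push the score up, and targets moving the other way pull it down, so a score near zero suggests the targets are unaffected.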
Expression of Hap4 targets
Yap6 targets are unaffected
Refined Network Model
Caveats
Assumes target genes are correct
Only models linear paths
Combinatorial effects missed
Measurements are for rich media growth
Using this method of choosing the next experiment
Is it better than other methods?
How many experiments?
Run simulations vs.:
Random
Hubs
Simulation results
Number of simulated deletion profiles used to learn a “true” network
Current Work
Treat disease
“Systems level” understanding
Test & Refine
Networks
Transcriptional response to DNA damage
Measurements
Acknowledgments
Trey Ideker
Craig Mak
Chen-Hsiang Yeang
Tommi Jaakkola
Scott McCuine
Maya Agarwal
Mike Daly
Ideker lab members
Funding grants from NIGMS, NSF, and NIH
Tom Begley
Leona Samson