Epistasis Analysis Using Microarrays

Chris Workman

Experiments with Microarrays

Cool technology, but how do we use it? How is it useful?

Identify “marker genes” in disease tissues

Classification, diagnostics

Toxicology, stress response

Drug candidate screens, basic science

Genetic factors

Measuring interactions (ChIP-on-chip)

Overview

Expression profiling in single deletions

Epistasis analysis using single and double deletions

Epistasis analysis, genetic and environmental factors

Reconstructing pathways that explain the genetic relationships between genes

Expression Profiling in 276 Yeast Single-Gene Deletion Strains (“The Rosetta Compendium”)

Only 19% of yeast genes are essential in rich media

Giaever et al. Nature (2002)

Clustered Rosetta Compendium Data

Gene Deletion Profiles Identify Gene Function and Pathways

Principle of Epistasis Analysis

Experimental Design

Compare single-gene deletions to wild type

Compare the double knockout to wild type

Experimental Design: Single vs Double-Gene Deletions
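The comparison above can be sketched in code. This is a minimal illustration, assuming each strain is summarized as a list of log expression ratios against wild type, and assuming the classical reading of epistasis: the gene whose single-deletion profile the double knockout resembles is called epistatic. The function names, the use of Pearson correlation, and the `margin` parameter are all hypothetical choices for this sketch, not taken from the slides.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def epistatic_gene(single_a, single_b, double_ab, margin=0.2):
    """Return "A" or "B" for the single deletion the double knockout resembles,
    or "unresolved" if the two correlations differ by no more than margin."""
    r_a = pearson(single_a, double_ab)
    r_b = pearson(single_b, double_ab)
    if r_a - r_b > margin:
        return "A"
    if r_b - r_a > margin:
        return "B"
    return "unresolved"

# Toy log-ratio profiles (made-up numbers, five genes each).
del_a = [2.0, -1.5, 0.1, 1.8, -0.2]
del_b = [0.1, 0.2, -2.0, 0.0, 1.5]
del_ab = [1.9, -1.4, 0.0, 1.7, -0.1]  # looks like the A deletion

print(epistatic_gene(del_a, del_b, del_ab))
```

A real analysis would need replicate-aware statistics rather than a fixed correlation margin; the margin here only keeps the toy example from calling near-ties.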

Classical Epistasis Analysis Using Microarrays to Determine the Molecular Phenotypes

Time-series expression (0-24 hrs), sampled every 2 hrs

Mixing Genetic and Environmental Factors

Expression in Single-Gene Deletions (yeast mec1 and dun1 deletion strains)

Chen-Hsiang Yeang, PhD; Craig Mak

MIT, UCSD, UC Santa Cruz

Yeang, Jaakkola, Ideker. J Comp Bio (2004)

Yeang, Mak, et al. Genome Res (2005)

Measurements

Networks

“Systems level” understanding

Treat disease

Synthetic biology

In silico cells

Test & Refine

Displaying deletion effects

Published work: “Epistasis analysis using expression profiling” (2005)

Relevant Interactions

Subset of Rosetta compendium used

28 deletions were TFs (red circles)

355 differentially expressed genes (white boxes)

P < 0.005

755 TF-deletion effects (grey squiggles)
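As a small illustration of how a significance cutoff turns a compendium into a list of TF-deletion effects, here is a hedged sketch. It assumes a p-value is already available for every (deleted TF, gene) pair; the function names and the toy p-values are invented for the example, and only the 0.005 cutoff comes from the slide.

```python
P_THRESHOLD = 0.005  # cutoff quoted on the slide

def significant_effects(pvalues, threshold=P_THRESHOLD):
    """Return the (deleted_tf, gene) pairs whose p-value is strictly below threshold."""
    return sorted(pair for pair, p in pvalues.items() if p < threshold)

def affected_genes(effects):
    """Return the distinct genes that appear in at least one significant effect."""
    return sorted({gene for _tf, gene in effects})

# Made-up p-values for placeholder TFs and genes.
toy_pvalues = {
    ("TF1", "geneA"): 0.0001,
    ("TF1", "geneB"): 0.2,
    ("TF2", "geneA"): 0.004,
    ("TF2", "geneC"): 0.005,  # equal to the cutoff, so not counted
}

effects = significant_effects(toy_pvalues)
print(effects)
print(affected_genes(effects))
```

Counting effects and counting affected genes give different numbers because one gene can respond to several deletions, which is why the slide reports both.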

Network Measurements

Yeast under normal growth conditions

Promoter binding: ChIP-chip / location analysis

Lee et al. Science (2002)

Protein-protein interaction: yeast two-hybrid

Database of Interacting Proteins (DIP)

Deane et al. Mol Cell Proteomics (2002)

ChIP Measurement of Protein-DNA Interactions (Chromatin Immunoprecipitation)

Step 1: Network connectivity (ChIP-chip analysis)

~ 5k genes (white boxes)

~ 20k interactions (green lines)

Step 2: Network annotation (gene expression analysis)

What parts are wired together

How and why the parts are wired together the way they are

Measure variables that are a function of the network (gene expression).

Monitor these effects after perturbing the network (TF knockouts).

Inferring regulatory paths

[Figure: an observed deletion effect is explained by either a direct path or an indirect path of physical interactions; each path is annotated as inducer or repressor]

Computational methods

Problem statement: find regulatory paths consisting of physical interactions that “explain” functional relationships
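The path-finding part of the problem statement can be sketched as a bounded depth-first search. This is only an illustration under stated assumptions: physical interactions are given as directed (regulator, target) pairs, a candidate explanation is any simple path from the deleted gene to the affected gene, and paths are capped at `max_len` edges. The function name `find_paths`, the cap, and the toy network are hypothetical.

```python
def find_paths(edges, source, target, max_len=3):
    """Enumerate simple directed paths from source to target with at most max_len edges."""
    adjacency = {}
    for regulator, regulated in edges:
        adjacency.setdefault(regulator, []).append(regulated)

    paths = []

    def extend(path):
        node = path[-1]
        if node == target:
            paths.append(list(path))
            return
        if len(path) - 1 >= max_len:  # path already uses max_len edges
            return
        for neighbor in adjacency.get(node, []):
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                path.append(neighbor)
                extend(path)
                path.pop()

    extend([source])
    return paths

# Toy physical network with placeholder names.
toy_edges = [
    ("TF1", "G"),    # direct binding
    ("TF1", "TF2"),
    ("TF2", "G"),    # indirect route through TF2
    ("TF2", "TF3"),
    ("TF3", "G"),
]

for path in find_paths(toy_edges, "TF1", "G", max_len=2):
    print(" -> ".join(path))
```

With `max_len=2` the toy network yields one direct and one indirect explanation; raising the cap admits the longer route through TF3 as well.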

Method: A probabilistic inference approach

Yeang, Ideker et al. J Comp Bio (2004)

To assign annotations:

Formalize problem using a factor graph

Solve using max-product algorithm

Kschischang et al. IEEE Trans. Information Theory (2001)

Mathematically similar to Bayesian inference, Markov random fields, belief propagation
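To make the annotation problem concrete, here is a toy stand-in that enumerates every sign assignment by brute force instead of running max-product on a factor graph. It assumes each edge is either an inducer (+1) or a repressor (-1), and that a path explains a deletion effect when the product of its edge signs is the opposite of the observed change (deleting an inducer lowers its target). That sign convention, the function names, and the toy network are assumptions made for this sketch; exhaustive enumeration is only feasible for a handful of edges.

```python
from itertools import product

def path_sign(annotation, path):
    """Multiply the signs of the edges along a path."""
    sign = 1
    for edge in path:
        sign *= annotation[edge]
    return sign

def consistent_annotations(edges, observations):
    """Return every sign assignment that explains all observations.

    observations: list of (path, effect), where path is a list of edges and
    effect is +1 (target went up after deletion) or -1 (target went down).
    """
    consistent = []
    for signs in product((+1, -1), repeat=len(edges)):
        annotation = dict(zip(edges, signs))
        if all(path_sign(annotation, path) == -effect for path, effect in observations):
            consistent.append(annotation)
    return consistent

def summarize(edges, annotations):
    """Label each edge by what the consistent assignments agree on."""
    labels = {}
    for edge in edges:
        seen = {annotation[edge] for annotation in annotations}
        if seen == {+1}:
            labels[edge] = "inducer"
        elif seen == {-1}:
            labels[edge] = "repressor"
        elif seen:
            labels[edge] = "ambiguous"
        else:
            labels[edge] = "no consistent annotation"
    return labels

# Toy network: TF1 -> TF2 -> G, plus TF2 -> H, which no observation covers.
e1, e2, e3 = ("TF1", "TF2"), ("TF2", "G"), ("TF2", "H")
toy_edges = [e1, e2, e3]
toy_observations = [
    ([e2], -1),      # deleting TF2 lowers G
    ([e1, e2], -1),  # deleting TF1 lowers G
]

solutions = consistent_annotations(toy_edges, toy_observations)
print(summarize(toy_edges, solutions))
```

Edges that no observation constrains come out as ambiguous, which is the situation that motivates choosing further deletion experiments.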

Inferred Network Annotations

A network with ambiguous annotation

Test & Refine

Which deletion experiments should we do first?

\[
I(M; Y_e) = H(M) - H(M \mid Y_e) = H(M) + \sum_{m,y} P(M = m,\, Y_e = y)\, \log_2 P(M = m \mid Y_e = y)
\]

A mutual-information-based score

For each candidate experiment (gene e):

Variability of predicted expression profiles

Predict profile for each possible set of annotations

More variable = more information from experiment

Reuse network inference algorithm to compute effect of deletion!
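The score can be computed directly once the candidate annotation sets and their predicted outcomes are tabulated. The sketch below assumes a discrete prior P(M = m) over candidate models and, for one experiment e, a predicted outcome distribution P(Y_e = y | M = m) per model; it then evaluates I(M; Y_e) = H(M) - H(M | Y_e). The dictionary layout, function name, and toy numbers are hypothetical.

```python
from math import log2

def mutual_information(prior, likelihood):
    """Compute I(M; Y_e) in bits.

    prior:      {model: P(M = model)}
    likelihood: {model: {outcome: P(Y_e = outcome | M = model)}}
    """
    # Joint distribution P(M = m, Y_e = y), skipping zero-probability cells.
    joint = {}
    for model, outcomes in likelihood.items():
        for outcome, p in outcomes.items():
            if prior[model] * p > 0:
                joint[(model, outcome)] = prior[model] * p

    # Marginal P(Y_e = y).
    p_outcome = {}
    for (_model, outcome), p in joint.items():
        p_outcome[outcome] = p_outcome.get(outcome, 0.0) + p

    entropy_m = -sum(p * log2(p) for p in prior.values() if p > 0)
    # H(M | Y_e) = -sum_{m,y} P(m, y) log2 P(m | y), with P(m | y) = P(m, y) / P(y).
    conditional_entropy = -sum(
        p * log2(p / p_outcome[outcome]) for (_model, outcome), p in joint.items()
    )
    return entropy_m - conditional_entropy

# Two equally likely annotation sets (placeholder names).
toy_prior = {"m1": 0.5, "m2": 0.5}

# Experiment A: the models predict opposite outcomes, so the result is decisive.
experiment_a = {"m1": {"up": 1.0}, "m2": {"down": 1.0}}
# Experiment B: both models predict the same outcome, so nothing is learned.
experiment_b = {"m1": {"up": 1.0}, "m2": {"up": 1.0}}

print(mutual_information(toy_prior, experiment_a))
print(mutual_information(toy_prior, experiment_b))
```

The toy case mirrors the slide's point: the experiment whose predicted profiles vary across annotation sets carries more information than one on which every model agrees.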

Ranking candidate experiments

Gene    Function                                        Score     Downstream genes   Rank   Model
HHF1    Histone                                         52.1429   74                 1      2
SOK2*   regulator for meiosis and PKA pathway           45.0279   64                 2      1
CKA1    protein kinase of cell cycle                    45.0075   64                 3      5
A2      mating response                                 40.9023   58                 4      4
YAP6*   stress response regulator                       35.1652   50                 5      1, 3
NRG1    regulator of glucose dependent genes            31.6501   45                 6      3
FKH1    regulator of cell cycle                         29.1194   41                 7      2
FKH2    regulator of cell cycle                         26.7131   38                 8      7
SLT2    protein kinase of cell wall integrity pathway   23.4727   31                 9      8
MSN4*   regulator of stress response                    21.8224   31                 10     1
HAP4*   regulator of cellular respiration               6.3310    9                  34     1

We target experiments to one region of network

Expression for: SOK2, HAP4, MSN4, YAP6

Expression of Msn4 targets

\[
Z_e = \frac{1}{N} \sum_{i=1}^{N} \operatorname{sgn}(r_{ie})\, z_{ie}
\]

Average signed z-score
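A hedged sketch of this score follows, reading it as the mean over the N target genes of sgn(r_ie) multiplied by z_ie. Here z_ie is taken to be the expression z-score of target i in deletion experiment e, and r_ie is assumed to be the predicted direction of change for that target; the slide does not define r_ie, so that interpretation, the function names, and the toy values are assumptions.

```python
def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def average_signed_z(z_scores, predicted_effects):
    """Mean of sgn(r_i) * z_i over the target genes of one experiment."""
    if len(z_scores) != len(predicted_effects):
        raise ValueError("need one predicted effect per z-score")
    total = sum(sign(r) * z for r, z in zip(predicted_effects, z_scores))
    return total / len(z_scores)

# Toy example: three targets whose observed changes all match the predictions.
toy_z = [2.0, -1.0, 3.0]
toy_predictions = [+1, -1, +1]

print(average_signed_z(toy_z, toy_predictions))
```

Under this reading, a large positive value means the targets moved in the predicted directions, a value near zero means no consistent response, and a negative value means the predictions were systematically reversed.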

Expression of Hap4 targets

Yap6 targets are unaffected

Refined Network Model

Caveats

Assumes target genes are correct

Only models linear paths

Combinatorial effects missed

Measurements are for rich media growth

Using this method of choosing the next experiment

Is it better than other methods?

How many experiments?

Run simulations vs.:

Random

Hubs
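The three selection strategies being compared can be written side by side. This is only a sketch of the selection step, not of the simulation itself: it assumes "Random" means a uniformly random candidate, "Hubs" means the candidate with the most physical interactions, and the third strategy picks the candidate with the highest information score. All function names and toy values are hypothetical.

```python
import random

def degrees(edges):
    """Count how many interactions each node takes part in."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree

def pick_random(candidates, rng):
    """Baseline: choose the next deletion uniformly at random."""
    return rng.choice(sorted(candidates))

def pick_hub(candidates, edges):
    """Baseline: choose the candidate with the most interactions (ties by name)."""
    degree = degrees(edges)
    return max(sorted(candidates), key=lambda gene: degree.get(gene, 0))

def pick_by_score(candidates, scores):
    """Choose the candidate with the highest information score (ties by name)."""
    return max(sorted(candidates), key=lambda gene: scores.get(gene, 0.0))

# Toy inputs with placeholder names and made-up scores.
toy_edges = [("TF1", "g1"), ("TF1", "g2"), ("TF1", "TF2"), ("TF2", "g3")]
toy_candidates = {"TF1", "TF2", "TF3"}
toy_scores = {"TF1": 0.1, "TF2": 0.9, "TF3": 0.4}

rng = random.Random(0)  # seeded so repeated runs pick the same gene
print(pick_random(toy_candidates, rng))
print(pick_hub(toy_candidates, toy_edges))
print(pick_by_score(toy_candidates, toy_scores))
```

A simulation would repeat the chosen selection step, add the simulated deletion profile to the data, and track how quickly each strategy recovers the network used to generate the profiles.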

Simulation results

Number of simulated deletion profiles used to learn a “true” network

Current Work

Treat disease


Test & Refine

Networks

Transcriptional response to DNA damage

Measurements

Acknowledgments

Trey Ideker

Craig Mak, Chen-Hsiang Yeang, Tommi Jaakkola

Scott McCuine, Maya Agarwal, Mike Daly, Ideker lab members

Funding grants from NIGMS, NSF, and NIH

Tom Begley, Leona Samson
