Rich Probabilistic Models for Gene Expression

Eran Segal (Stanford), Ben Taskar (Stanford), Audrey Gasch (Berkeley), Nir Friedman (Hebrew University), Daphne Koller (Stanford). Our goals: find patterns in gene expression data, where A_ij is the mRNA level of gene i in experiment j.

Transcript of Rich Probabilistic Models for Gene Expression

  • Rich Probabilistic Models for Gene Expression. Eran Segal (Stanford), Ben Taskar (Stanford), Audrey Gasch (Berkeley), Nir Friedman (Hebrew University), Daphne Koller (Stanford)

  • Our Goals: Find patterns in gene expression data

  • Data Organization. [Figure: Genes by Experiments matrix; legend: Induced, Repressed]

  • Standard Clustering Organization. [Figure axes: Experiments, Genes]

  • Bi-Clustering Organization. [Figure axes: Experiments, Genes]

  • Desired Organization: Detect similarities over subsets of genes and experiments. Note: rows and columns no longer correspond to genes and experiments.

  • Incorporate Heterogeneous Data: Find correlations directly; Focus on novel discoveries

  • Our Approach. [Figure labels: Clinical information; Experimental details; Annotations (GO, MIPS, YPD); ACGCCTA]

  • Probabilistic Relational Models (Koller & Pfeffer 98; Friedman, Getoor, Koller & Pfeffer 99). [Figure labels: Gene, Gene Cluster; Experiment, Exp. cluster; Expression, Level]

  • Resulting Bayesian Network. [Figure labels: Gene, Gene Cluster; Experiment, Exp. cluster; Expression, Level]

  • Probabilistic Relational Models. [Figure labels: Gene, Gene Cluster; Experiment, Exp. cluster; Expression, Level]

  • Adding Heterogeneous Data. [Figure labels: Gene, Gene Cluster; Experiment, Exp. cluster; Expression, Level]

  • Resulting Bayesian Network. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Problem: Exponential Blowup. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]
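To make the blowup concrete, here is a minimal sketch that counts the rows of a full conditional table for Level when every listed attribute is a parent. The cardinalities are hypothetical (the slides name the attributes but not how many values each takes), and the helper names are illustrative only.

```python
from math import prod

# Hypothetical numbers of values per parent attribute of Expression.Level;
# the slides give the attribute names only, not these counts.
parent_cardinalities = {
    "Gene Cluster": 5,
    "Exp. cluster": 5,
    "Exp. type": 4,
    "Lipid": 2,
    "HSF": 2,
    "Endoplasmatic": 2,
    "GCN4": 2,
}


def full_table_rows(cardinalities):
    """Number of parent configurations a full table must parameterize."""
    return prod(cardinalities.values())


def rows_after_adding_binary_parents(cardinalities, extra):
    """Each additional binary parent doubles the number of rows."""
    return full_table_rows(cardinalities) * 2 ** extra


print(full_table_rows(parent_cardinalities))  # 5 * 5 * 4 * 2**4 = 1600
print(rows_after_adding_binary_parents(parent_cardinalities, 10))
```

Under these assumed sizes, ten more binary annotations multiply the table by 1024, which is the growth the slide title refers to.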

  • Solution: Context Specificity. [Figure label: Ultra Violet Light]

  • Solution: Context Specificity. [Figure labels: Ultra Violet Light; DNA Damage; DNA repair genes transcribed. Model labels: Gene, DNA repair; Experiment, UV Light; Expression, Level; tree branches UV = No, UV = Yes]

  • Solution: Context Specificity. [Figure labels: Ultra Violet Light; DNA Damage; DNA repair genes transcribed. Model labels: Gene, DNA repair; Experiment, UV Light; Expression, Level]

  • Modeling Context Specificity: Grouping = a leaf in the tree. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]
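The idea that a grouping is a leaf of a tree can be sketched as a small tree-structured conditional distribution. This is an illustration under assumptions, not the authors' implementation: the tree shape, the leaf names `g1` to `g4`, and the `mean_level` values are all made up, and only the attribute names come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    name: str          # label of the grouping (hypothetical)
    mean_level: float  # hypothetical parameter shared by the whole grouping


@dataclass
class Split:
    attribute: str               # attribute tested at this internal node
    children: Dict[str, "Node"]  # attribute value -> subtree


Node = Union[Leaf, Split]


def find_grouping(node, context):
    """Walk from the root to the leaf selected by the attribute values."""
    while isinstance(node, Split):
        node = node.children[context[node.attribute]]
    return node


def count_leaves(node):
    """Number of groupings, i.e. distinct parameter sets, in the tree."""
    if isinstance(node, Leaf):
        return 1
    return sum(count_leaves(child) for child in node.children.values())


# Made-up tree: Gene Cluster and HSF only matter when Exp. cluster is "2".
tree = Split("Exp. cluster", {
    "1": Leaf("g1", 0.0),
    "2": Split("Gene Cluster", {
        "No": Leaf("g2", -0.5),
        "Yes": Split("HSF", {
            "No": Leaf("g3", 0.4),
            "Yes": Leaf("g4", 1.8),
        }),
    }),
})

a = {"Exp. cluster": "1", "Gene Cluster": "Yes", "HSF": "Yes"}
b = {"Exp. cluster": "1", "Gene Cluster": "No", "HSF": "No"}
print(find_grouping(tree, a).name, find_grouping(tree, b).name)
print(count_leaves(tree))
```

Both example contexts land in the same leaf because the other attributes are never consulted when Exp. cluster is "1". The tree has 4 leaves where a full table over these three binary attributes would need 8 rows.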

  • Learning the Models. [Figure labels: LEARNER; Experimental details; Annotations (GO, MIPS, YPD); ACGCCTA]

  • Automatic Induction. Structure Learning: Dependency structure; Tree structure. Missing Data: Gene cluster & experiment cluster never observed
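The slides do not say which algorithm fills in the unobserved cluster variables. One standard way to fit a model with hidden variables is expectation-maximization (EM), and the sketch below shows that idea on gene clusters only. Everything in it is an assumption made for illustration: the toy profiles, the fixed variance, the uniform cluster prior, and the function name `em_cluster`. It ignores experiment clusters and structure learning entirely.

```python
import math


def em_cluster(profiles, init_means, iterations=20, variance=1.0):
    """Soft EM for a mixture of spherical Gaussians.

    Assumes a fixed shared variance and a uniform prior over clusters.
    Returns the cluster means and the responsibilities from the last E-step.
    """
    means = [list(m) for m in init_means]
    dims = len(profiles[0])
    resp = []
    for _ in range(iterations):
        # E-step: posterior over the hidden cluster of each gene.
        resp = []
        for x in profiles:
            log_w = [
                -sum((xi - mi) ** 2 for xi, mi in zip(x, m)) / (2 * variance)
                for m in means
            ]
            top = max(log_w)  # subtract the max for numerical stability
            w = [math.exp(lw - top) for lw in log_w]
            total = sum(w)
            resp.append([wi / total for wi in w])
        # M-step: each mean becomes a responsibility-weighted average.
        for c in range(len(means)):
            weight = sum(r[c] for r in resp)
            if weight > 0:
                means[c] = [
                    sum(r[c] * x[j] for r, x in zip(resp, profiles)) / weight
                    for j in range(dims)
                ]
    return means, resp


# Toy expression profiles (rows: genes, columns: experiments); invented data.
profiles = [
    [2.0, 1.8, -0.1],
    [1.9, 2.1, 0.0],
    [-1.0, -1.2, 0.1],
    [-0.9, -1.1, -0.2],
]
means, resp = em_cluster(profiles, init_means=[profiles[0], profiles[2]])
hard = [max(range(len(means)), key=lambda c: r[c]) for r in resp]
print(hard)
```

The cluster label is never read from the data. It exists only as a posterior computed in the E-step, which is the sense in which the cluster is "never observed".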

  • Learning Process. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Learning Process: Experiment Similarity. Tree split on Level: Exp. Cluster = 2. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Learning Process: Gene Similarity. Tree splits on Level: Exp. Cluster = 2; Gene Cluster = Yes. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Learning Process: Separability by binding site. Tree splits on Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Learning Process: Attribute dependencies induce cluster changes. Tree splits on Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Learning Process: Achieved desired clustering. Tree splits on Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes; GCN4 = Yes. [Figure labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4; Experiment, Exp. cluster, Exp. type; Expression, Level]

  • Yeast Stress Data (Gasch et al 2001): Measured response to stress cond.; 92 arrays; We selected ~900 genes; Added data: TRANSFAC, MIPS. Results: 15 significant TFs; 7 significant function categories; 793 Groupings

  • Context Specific Groupings

  • Context Specific Groupings

  • Example Biological Finding: Discovered grouping of 17 genes; All induced in diauxic shift; All have 2 binding sites for MIG1 transcription factor; Many not known to be regulated by MIG1; Context-sensitive groupings were key to finding cluster

  • Compendium Data (Hughes et al 2000): 300 samples of yeast deletion mutants. [Figure labels: Gene, GCluster, Lipid, HSF, Endoplasmatic, GCN4; Array/Mutated Gene, ACluster; Expression, Level]

  • Resulting Bayesian Network. [Figure labels: Gene 1, Gene 2, Gene 3, Gene 4; Gene Cluster1 to Gene Cluster4; HSF1 to HSF4; Lipid1, Lipid3; Gene 1 mutant, Gene 3 mutant; Array. cluster1, Array. cluster3; Level nodes with paired indices such as Level1,1 and Level3,2]

  • Experimental Setup. Goal: predict the effect of mutating specific genes without performing the experiment (!)

  • Experimental Setup. [Figure labels: Gene 1 mutant, Gene 3 mutant, Gene 4 mutant; Array. cluster1, Array. cluster3, Array. cluster?; Gene Cluster1 to Gene Cluster4; HSF1 to HSF4; Lipid1, Lipid3, Lipid4; Level nodes with paired indices; one node marked ?]

  • Results

  • Conclusions. Presented a unified probabilistic framework: Models complex biological domains; Expressive data organization; Incorporates heterogeneous data. Future directions: Incorporate DNA and protein sequence data; Discover regulatory networks.

    Paper: http://www.cs.stanford.edu/~eran

    Software (soon): http://dags.stanford.edu/bio

    Contact: [email protected]

    Thank You!