Rich Probabilistic Models for Gene Expression
-
Rich Probabilistic Models for Gene Expression
Eran Segal (Stanford)
Ben Taskar (Stanford)
Audrey Gasch (Berkeley)
Nir Friedman (Hebrew University)
Daphne Koller (Stanford)
-
Our Goals
Find patterns in gene expression data
-
Data Organization
[Figure: expression matrix; axes: Genes, Experiments; legend: Induced, Repressed]
-
Standard Clustering Organization
[Figure: expression matrix; axes: Genes, Experiments]
-
Bi-Clustering Organization
[Figure: expression matrix; axes: Genes, Experiments]
-
Desired Organization
Detect similarities over subsets of genes and experiments
Note: rows and columns no longer correspond to genes and experiments
-
Incorporate Heterogeneous Data
Find correlations directly
Focus on novel discoveries
-
Our Approach
[Figure labels: Clinical information, Experimental Details, Annotations (GO, MIPS, YPD), sequence (ACGCCTA)]
-
Probabilistic Relational Models
(Koller & Pfeffer 98; Friedman, Getoor, Koller & Pfeffer 99)
[Diagram labels: Gene, Gene Cluster, Experiment, Exp. cluster, Expression, Level]
-
Resulting Bayesian Network
[Diagram labels: Gene, Gene Cluster, Experiment, Exp. cluster, Expression, Level]
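The step from a relational model to its "resulting Bayesian network" can be sketched as unrolling a class-level template into one node per object. The sketch below is illustrative only: the function name `ground_network`, the node naming, and the assumption that each Level depends on the gene's cluster and the experiment's cluster are not taken from the slides, whose arrows did not survive in the transcript.

```python
from itertools import product

def ground_network(genes, experiments):
    """Unroll a class-level template into one node per object.

    Returns a dict mapping each ground node to the list of its parents.
    Assumed template (hypothetical): Level depends on Gene Cluster and
    Exp. cluster.
    """
    parents = {}
    # One hidden cluster node per gene and per experiment.
    for g in genes:
        parents[("GeneCluster", g)] = []
    for e in experiments:
        parents[("ExpCluster", e)] = []
    # One Level node per (gene, experiment) pair.
    for g, e in product(genes, experiments):
        parents[("Level", g, e)] = [("GeneCluster", g), ("ExpCluster", e)]
    return parents

net = ground_network(["g1", "g2", "g3"], ["e1", "e2"])
```

With 3 genes and 2 experiments the ground network has 3 + 2 cluster nodes and 3 × 2 Level nodes, all sharing the same template parameters.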
-
Probabilistic Relational Models
[Diagram labels: Gene, Gene Cluster, Experiment, Exp. cluster, Expression, Level]
-
Adding Heterogeneous Data
[Diagram labels: Gene, Gene Cluster, Experiment, Exp. cluster, Expression, Level]
-
Resulting Bayesian Network
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
-
Problem: Exponential Blowup
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
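The blowup can be made concrete by counting the rows of a full conditional probability table for Level: one row per joint assignment of its parents. The cardinalities below are made-up illustrative values, not numbers from the talk, and `cpt_rows` is a hypothetical helper.

```python
from math import prod

def cpt_rows(cardinalities):
    """Number of parent configurations in a full table CPD."""
    return prod(cardinalities.values())

# Assumed (illustrative) number of values per parent attribute.
parents = {
    "Gene Cluster": 5,
    "Lipid": 2,
    "HSF": 2,
    "Endoplasmatic": 2,
    "GCN4": 2,
    "Exp. cluster": 5,
    "Exp. type": 3,
}
rows = cpt_rows(parents)

# Adding one more binary attribute doubles the table.
rows_with_extra = cpt_rows({**parents, "Another TF": 2})
```

Under these assumed cardinalities the table already needs 1200 rows, and every additional binary parent doubles it, which is why a full table does not scale as annotations are added.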
-
Solution: Context Specificity
[Figure label: Ultra Violet Light]
-
Solution: Context Specificity
[Figure labels: Ultra Violet Light, DNA Damage, DNA repair genes transcribed]
[Diagram labels: Gene, DNA repair, Experiment, UV Light, Expression, Level; tree split: UV = No, UV = Yes]
-
Solution: Context Specificity
[Figure labels: Ultra Violet Light, DNA Damage, DNA repair genes transcribed]
[Diagram labels: Gene, DNA repair, Experiment, UV Light, Expression, Level]
-
Modeling Context Specificity
Grouping = a leaf in the tree
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
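A tree-structured conditional distribution can be sketched as follows: internal nodes test one attribute, and every leaf is one grouping with its own parameters. The class names, the particular splits, and the mean values are all invented for illustration; only the attribute names come from the slides.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    name: str          # label of the grouping
    mean_level: float  # illustrative parameter, not a real estimate

@dataclass
class Split:
    attribute: str
    value: object
    if_true: Union["Split", Leaf]
    if_false: Union["Split", Leaf]

def find_leaf(node, context):
    """Walk the tree; only attributes tested on the path matter."""
    while isinstance(node, Split):
        matches = context.get(node.attribute) == node.value
        node = node.if_true if matches else node.if_false
    return node

# Hypothetical tree: HSF is only consulted inside Exp. cluster 2.
tree = Split(
    "Exp. cluster", 2,
    Split("HSF", "Yes", Leaf("A", 1.5), Leaf("B", 0.2)),
    Leaf("C", 0.0),
)

in_context = find_leaf(tree, {"Exp. cluster": 2, "HSF": "Yes", "GCN4": "No"})
out_of_context = find_leaf(tree, {"Exp. cluster": 1, "HSF": "Yes", "GCN4": "Yes"})
```

This hypothetical tree has three leaves instead of one table row per joint assignment, and HSF influences the Level only in the context Exp. cluster = 2.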
-
Learning the Models
[Figure: LEARNER; inputs labeled Experimental Details, Annotations (GO, MIPS, YPD), sequence (ACGCCTA)]
-
Automatic Induction
Structure Learning:
  Dependency structure
  Tree structure
Missing Data:
  Gene cluster & experiment cluster never observed
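Because the cluster variables are never observed, learning has to alternate between filling them in and re-estimating parameters. The sketch below is a deliberately simplified stand-in, a hard-assignment loop over gene clusters only (essentially k-means on expression rows), and is not claimed to be the procedure used in the talk. The function name `hard_em`, the data, and the starting centers are hypothetical.

```python
def hard_em(rows, centers, iterations=10):
    """Alternate hidden-cluster assignment and center re-estimation."""
    assignment = []
    for _ in range(iterations):
        # Completion step: give each gene the best-fitting cluster.
        assignment = [
            min(
                range(len(centers)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(row, centers[k])),
            )
            for row in rows
        ]
        # Estimation step: each center becomes the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [row for row, a in zip(rows, assignment) if a == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        centers = new_centers
    return assignment, centers

# Toy expression rows (genes x experiments), invented for illustration.
rows = [[2.0, 1.8], [1.9, 2.1], [-2.0, -1.7], [-1.8, -2.2]]
assignment, centers = hard_em(rows, centers=[[1.0, 1.0], [-1.0, -1.0]])
```

On this toy input the two induced genes end up in one cluster and the two repressed genes in the other.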
-
Learning Process
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
-
Learning Process
Experiment Similarity
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
[Tree for Level: Exp. Cluster = 2]
-
Learning Process
Gene Similarity
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
[Tree for Level: Exp. Cluster = 2; Gene Cluster = Yes]
-
Learning Process
Separability by binding site
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
[Tree for Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes]
-
Learning Process
Attribute dependencies: induce cluster changes
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
[Tree for Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes]
-
Learning Process
Achieved desired clustering
[Diagram labels: Gene, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Experiment, Exp. cluster, Exp. type, Expression, Level]
[Tree for Level: Exp. Cluster = 2; Gene Cluster = Yes; HSF = Yes; GCN4 = Yes (in two branches)]
-
Yeast Stress Data (Gasch et al 2001)
Measured response to stress conditions
92 arrays
We selected ~900 genes
Added data: TRANSFAC, MIPS
Results:
  15 significant TFs
  7 significant function categories
  793 groupings
-
Context Specific Groupings
-
Example Biological Finding
Discovered grouping of 17 genes
All induced in diauxic shift
All have 2 binding sites for MIG1 transcription factor
Many not known to be regulated by MIG1
Context-sensitive groupings were key to finding cluster
-
Compendium Data (Hughes et al 2000)
300 samples of yeast deletion mutants
[Diagram labels: Gene, GCluster, Lipid, HSF, Endoplasmatic, GCN4, Array/Mutated Gene, ACluster, Expression, Level]
-
Resulting Bayesian Network
[Diagram: ground network over Gene 1 to Gene 4 and the Gene 1 mutant and Gene 3 mutant arrays; node families: Gene Cluster (per gene), HSF (per gene), Lipid (per gene), Array cluster (per array), Level (per gene and array)]
-
Experimental Setup
Goal: predict the effect of mutating specific genes without performing the experiment (!)
-
Experimental Setup
[Diagram: the same ground network with an added Gene 4 mutant array; nodes marked "?" include its Array cluster]
-
Results
-
Conclusions
Presented a unified probabilistic framework:
  Models complex biological domains
  Expressive data organization
  Incorporates heterogeneous data
Future directions:
  Incorporate DNA and protein sequence data
  Discover regulatory networks
Paper: http://www.cs.stanford.edu/~eran
Software (soon): http://dags.stanford.edu/bio
Contact: [email protected]
Thank You!