Predicting the impact of mutations using pathway- guided
integrative genomics AACR Annual Meeting 2012 Major Symposium:
Designing Rational Combination Therapy for Cancer Josh Stuart, UC
Santa Cruz Apr 3, 2012
Slide 2
Disclosure SAB for Five3 Genomics
Slide 3
Overview of pathway-guided approach Integrate many data sources
to gain accurate view of how genes are functioning in pathways
Predict the functional consequences of mutations by quantifying the
effect on the surrounding pathway Use pathway signatures to
implicate mutations in novel genes to (re-)focus targeting Identify
critical Achilles Heels in the pathways that distinguish a
particular sub-type
Slide 4
Flood of Data Analysis Challenges Multiple, Possibly
Conflicting Signals Multiple, Possibly Conflicting Signals This is
What it Does to You This is What it Does to You Genomics,
Functional Genomics, Metabolomics, Epigenomics =
Slide 5
Analysis of disease samples like automotive repair (or
detective work or other sleuthing) Patient Sample 1 Patient Sample
2 Patient Sample 3Patient Sample N
Slide 6
Much Cell Machinery Known: Gene circuitry now available.
Curated and/or Collected ReactomeKEGGBiocartaNCI-PID Pathway
Commons
Slide 7
Integration key to interpret gene function Expression not
always an indicator of activity Downstream effects often provide
clues TF Inference: TF is OFF (high expression but inactive) TF
Inference: TF is ON (low-expression but active ) TF Inference: TF
is ON (expression reflects activity) Expression of 3 transcription
factors: high low
Slide 8
Integration key to interpret gene function Need multiple data
modalities to get it right. TF Expression -> TF ON BUT, targets
are amplified Lowers our belief in active TF because explained away
by cis evidence. Copy Number -> TF OFF
Slide 9
Probabilistic Graphical Models: A Language for Integrative
Genomics Generalize HMMs, Kalman Filters, Regression, Boolean Nets,
etc. Language of probability ties together multiple aspects of gene
function & regulation Enable data-driven discovery of
biological mechanisms Foundation: J. Pearl, D. Heckerman, E.
Horvitz, G. Cooper, R. Schacter, D. Koller, N. Friedman, M. Jordan,
Bioinformatics: D. Peer, A. Hartemink, E. Segal, E Schadt Nir
Friedman, Science (2004) - Review
Slide 10
Integration Approach: Detailed models of gene expression and
interaction MDM2 TP53
Slide 11
Integration Approach: Detailed models of expression and
interaction MDM2 TP53 Two Parts: 1.Gene Level Model (central dogma)
2. Interaction Model (regulation)
Slide 12
PARDIGM Gene Model to Integrate Data Vaske et al. 2010.
Bioinformatics 1. Central Dogma-Like Gene Model of Activity 2.
Interactions that connect to specific points in gene regulation map
Charlie Vaske Steve Benz
Slide 13
Integrated Pathway Analysis for Cancer Integrated dataset for
downstream analysis Inferred activities reflect neighborhood of
influencearound a gene. Can boost signal for survival analysis and
mutationimpact Multimodal Data CNV mRNA meth Pathway Model of
Cancer Cohort Inferred Activities
Slide 14
TCGA Ovarian Cancer Inferred Pathway Activities Patient Samples
(247) Pathway Concepts (867) TCGA Network. 2011. Nature (lead by
Paul Spellman)
Slide 15
Ovarian: FOXM1 pathway altered in majority of serous ovarian
tumors FOXM1 Transcription Network Patient Samples (247) Pathway
Concepts (867) TCGA Network. 2011. Nature (lead by Paul
Spellman)
Slide 16
FOXM1 central to cross-talk between DNA repair and cell
proliferation in Ovarian Cancer TCGA Network. 2011. Nature (lead by
Paul Spellman)
Slide 17
PATHMARK: Identify Pathway-based markers that underlie
sub-types Identify sub-pathways that distinguish patients sub-types
(e.g. mutant vs. non-mutant, response to drug, etc) Predict
mutation impact on pathway neighborhood Identify master control
points for drug targeting. Predict outcomes with quantitative
simulations. Insight from contrast Ted Goldstein Sam Ng
Slide 18
Pathway signatures of mutations to reveal therapeutic
candidates Mutated genes are the focus of many targeted approaches.
Some patients with right mutation dont respond. Why? Many cancers
have one of several novel mutations. Can these be targeted with
current approaches? Pathway-motivated approaches: Identify
gain-of-function from loss-of-function. Compare novel signatures
Sam Ng Poster # 2985
Slide 19
FG High Inferred Activity Low Inferred Activity Predicted
Loss-Of-Function Predicted Gain-Of-Function PARADIGM-Shift: Pathway
context of GOF and LOF events Sam Ng Use pathways to predict the
impact of observed mutations in patient tumors
Slide 20
PARADIGM-Shift Predicting the Impact of Mutations On Genetic
Pathways FG Inference using all neighbors FG Inference using
downstream neighbors FG Inference using upstream neighbors Sam Ng
High Inferred Activity Low Inferred Activity mutated gene
Slide 21
FG PARADIGM-Shift Calculation Overview Sam Ng
Slide 22
FG 1. Identify Local Neighbor- hood PARADIGM-Shift Calculation
Overview Sam Ng
Slide 23
FG 1. Identify Local Neighbor- hood FG 2a. Regulators Run 2b.
Targets Run PARADIGM-Shift Calculation Overview Sam Ng
Slide 24
FG 1. Identify Local Neighbor- hood FG 2a. Regulators Run 2b.
Targets Run P-Shift Score 3. Calculate Difference FG - (LOF)
PARADIGM-Shift Calculation Overview Sam Ng
Slide 25
P-Shift Predicts RB1 Loss-of-Function in GBM Shift Score
PARADIGM downstream PARADIGM upstream Expression Mutation RB1 Sam
Ng
Slide 26
RB1 Network (GBM) Sam Ng Neighbor Gene Key Activity Expression
RB1 Mutation Focus Gene Key P-Shift Expression Mutation T-Run
R-Run
Slide 27
RB1 Discrepancy Scores distinguish mutated vs non-mutated
samples Signal Score (t-statistic) = -5.78 Sam Ng
Slide 28
RB1 discrepancy distinction is significant Given the same
network topology, how likely would we call a gain/loss of function
Background model: permute gene labels in our dataset Compare
observed signal score to signal scores (SS) obtained from
background model Observed SS Background SS Sam Ng
Slide 29
TP53 Network Sam Ng
Slide 30
Gain-of-Function (LUSC) NFE2L2 P-Shift Score PARADIGM
downstream PARADIGM upstream Expression Mutation Sam Ng
Discrepancy scores are sensitive Discrepancy Score Signal Score
(t-statistic) = -5.78 Signal Score (t-statistic) = 4.985 RB1NFE2L2
Signal Score (t-statistic) = -10.94 TP53 Observed SS Background SS
Sam Ng
Slide 33
Expect passenger mutations to lack shifts Is the discrepancy
specific? Negative control: calculate scores for passenger
mutations Passengers: insignificant by MutSig (p > 0.10)
well-represented in our pathways Discrepancy of these neutral
mutations should be close to whats expected by chance (from
permuted) Sam Ng
Slide 34
Discrepancies of Passenger Mutations are NOT distinctive Sam
Ng
Slide 35
PARADIGM-Shift gives orthogonal view of the importance of
mutations in LUSC Enables probing into infrequent events Can detect
non-coding mutation impact (pseudo FPs) Can detect presence of
pathway compensation for those seemingly functional mutations
(pseudo FPs) Extend beyond mutations Limited to genes w/ pathway
representation NFE2L2 (29) CDKN2A (n=30) Pathway Discrepancy LUSC
MET (n=7) (gefitinib resistance) HIF3A (n=7) TBC1D4 (n=9) (AKT
signaling) MAP2K6 (n=5) EIF4G1 (n=20) GLI2 (n=10) (SHH signaling)
AR (n=8) Sam Ng
Slide 36
Defining Pathway Signatures for Mutations and Sub-Types Build a
signature for every mutation and tumor/clinical event. Correlate
every signature to each other. Reveals common molecular
similarities between different divisions of patient subgroups
Mutations in novel genes may phenocopy mutations in known genes Ted
Goldstein
Slide 37
PathMark: Differential Subnetworks from a SuperPathway Pathway
Activities Ted Goldstein Sam Ng
Slide 38
PathMark: Differential Subnetworks from a SuperPathway
SuperPathway Activities Pathway Signature Ted Goldstein Sam Ng
Slide 39
Traditional methods treat each gene as a separate feature
Traditional methods treat each gene as a separate feature Use
features reflecting overall pathway activity Use features
reflecting overall pathway activity Smaller number of features are
now fed to predictors Smaller number of features are now fed to
predictors Predictor PIL: Pathway-informed Learning Artem
Sokolov
Slide 40
Traditional methods treat each gene as a separate feature
Traditional methods treat each gene as a separate feature Use
features reflecting overall pathway activity Use features
reflecting overall pathway activity Smaller number of features are
now fed to predictors Smaller number of features are now fed to
predictors Predictor PIL: Pathway-informed Learning Artem
Sokolov
Slide 41
Basal vs. Luminal Recursive feature elimination: we train an
SVM, drop the least important half of features and recurse The
number of times each feature survived the elimination across 100
random splits of data Artem Sokolov
Slide 42
Methotrexate Sensitivity Non sub-type specific drug Artem
Sokolov Pathway involving the target of the drug.
Slide 43
One large highly-connected component (size and connectivity
significant according to permutation analysis) Higher activity in
ER- Lower activity in ER- Triple Negative Breast Pathway Markers
Identified from 50 Cell Lines 980 pathway concepts 1048
interactions HIF1A/ARNT Characterized by several hubs
IL23/JAK2/TYK2 Myc/Max P53 tetramer ER FOXA1 Sam Ng, Ted
Goldstein
Slide 44
Identify master controllers using SPIA (signaling pathway
impact analysis) Impact factor: Google PageRank for Networks
Determines affect of a given pathway on each node Calculates
perturbation factor for each node in the network Takes into account
regulatory logic of interactions. Yulia Newton Googles
PageRank
Slide 45
Slight Trick: Run SPIA in reverse Reverse edges in Super
Pathway High scoring genes now those at the top of the pathway
Yulia Newton
Slide 46
Master Controller Analysis on Breast Cell Lines Basal Luminal
Yulia Newton
Slide 47
Master regulators predict response to drugs: PLK3 predicted as
a target for basal breast Up Down DNA damage network is upregulated
in basal breast cancers Basal breast cancers are sensitive to PLK
inhibitors BasalClaudin-lowLuminal GSK-PLKi Heiser et al. 2011 PNAS
Ng, Goldstein
Slide 48
HDAC Network is down- regulated in basal breast cancer cell
lines Basal/CL breast cancers are resistant to HDAC inhibitors
VORINOSTAT HDAC inhibitor HDAC inhibitors predicted for luminal
breast Heiser et al. 2011 PNAS Ng, Goldstein
Slide 49
PARADIGM in TCGA patient BRCA tumors Christina Yau, Buck
Inst
Slide 50
Christina Yau, Buck Inst.
Slide 51
Connect genomic alterations to downstream expression/activity
What circuitry connects mutations to transcriptional changes?
Mutations general (epi-) genomic perturbation Expression activity
Mutation/perturbation and expression/activity treated as heat
diffusing on a network HotNet, Vandin F, Upfal E, B.J. Raphael,
2008. HotNet used in ovarian to implicate Notch pathway Find
subnetworks that link genetic to mRNA and protein-level changes. ?
Evan Paull
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? Mutations PARADIGM Signatures Ted
Goldstein
Slide 62
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? (Note: CRC figure below; soon for BRCA)
Mutations PARADIGM Signatures APC and other Wnt Ted Goldstein
Slide 63
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? Mutations PARADIGM Signatures Ted
Goldstein
Slide 64
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? (Note: CRC figure below; soon for BRCA)
Mutations PARADIGM Signatures TGFB Pathway mutations Ted
Goldstein
Slide 65
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? (Note: CRC figure below; soon for BRCA)
Mutations PARADIGM Signatures PIK3CA, RTK pathway, KRAS Ted
Goldstein
Slide 66
Mutation Association to Pathways What pathway activities is a
mutations presence associated? Can we classify mutations based on
these associations? (Note: CRC figure below; soon for BRCA)
Mutations PARADIGM Signatures Ted Goldstein
Slide 67
Summary Modeling information flow on known pathways gives view
of gene activity.Modeling information flow on known pathways gives
view of gene activity. Patient stratification into pathway-based
subtypesPatient stratification into pathway-based subtypes
Sub-networks provide pathway-based signatures of sub- types and
mutations.Sub-networks provide pathway-based signatures of sub-
types and mutations. Loss- and gain-of-function predicted from
pathway neighbors for even rare mutations.Loss- and
gain-of-function predicted from pathway neighbors for even rare
mutations. Identify interlinking genes associated with mutations to
implicate additional targets even in LOF casesIdentify interlinking
genes associated with mutations to implicate additional targets
even in LOF cases E.g. Target MYC-related pathways in certain TP53-
deficient cells?E.g. Target MYC-related pathways in certain TP53-
deficient cells? Current work: Use pathways to now simulate the
affects of specific knock-downs.Current work: Use pathways to now
simulate the affects of specific knock-downs.
Slide 68
James Durbin Chris Szeto UCSC Integrative Genomics Group Sam Ng
Ted Golstein Dan Carlin Evan Paull Artem Sokolov Chris Wong Marcos
Woehrmann Yulia Newton
Slide 69
Acknowledgments UCSC Cancer Genomics Kyle Ellrott Brian Craft
Chris Wilks Amie Radenbaugh Mia Grifford Sofie Salama Steve Benz
UCSC Genome Browser Staff Mark Diekins Melissa Cline Jorge Garcia
Erich Weiler Buck Institute for Aging Christina Yau Sean Mooney
Janita Thusberg Collaborators Joe Gray, LBL Laura Heiser, LBL Eric
Collisson, UCSF Nuria Lopez-Bigas, UPF Abel Gonzalez, UPF Funding
Agencies NCI/NIH SU2C NHGRI AACR UCSF Comprehensive Cancer Center
QB3 Jing Zhu David Haussler Chris Benz, Broad Institute Gaddy Getz
Mike Noble Daniel DeCara
Slide 70
UCSC Cancer Browser genome-cancer.ucsc.edu Jing Zhu