Data Visualization to Enhance our Understanding of the Cancer Genome


Transcript of Data Visualization to Enhance our Understanding of the Cancer Genome

Page 1: Data Visualization to Enhance our Understanding of the Cancer Genome

Data Visualization to Enhance our Understanding of the Cancer Genome

HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS

NILS GEHLENBORG

@nils_gehlenborg http://gehlenborglab.org

Page 2: Data Visualization to Enhance our Understanding of the Cancer Genome

SAMUEL GRATZL JOHANNES KEPLER UNIVERSITY LINZ

ALEXANDER LEX UNIVERSITY OF UTAH

MARC STREIT JOHANNES KEPLER UNIVERSITY LINZ

Page 3: Data Visualization to Enhance our Understanding of the Cancer Genome

ROLE OF VISUALIZATION

Page 4: Data Visualization to Enhance our Understanding of the Cancer Genome

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

hypothesis generation

Page 5: Data Visualization to Enhance our Understanding of the Cancer Genome

PUBLICATION

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

hypothesis generation

Page 6: Data Visualization to Enhance our Understanding of the Cancer Genome

PUBLICATION

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

hypothesis generation

PRESENTATION “Storytelling”

Page 7: Data Visualization to Enhance our Understanding of the Cancer Genome

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

hypothesis generation

EXPLORATION “Pattern Discovery”

Page 8: Data Visualization to Enhance our Understanding of the Cancer Genome

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

HYPOTHESIS

hypothesis generation

EXPLORATION

HYPOTHESIS-DRIVEN DISCOVERY

“Pattern Discovery”

Page 9: Data Visualization to Enhance our Understanding of the Cancer Genome

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

DATA

hypothesis generation

EXPLORATION

DATA-DRIVEN DISCOVERY

“Pattern Discovery”

Page 10: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 11: Data Visualization to Enhance our Understanding of the Cancer Genome

The Cancer Genome Atlas

10,000+

genomes + clinical data + molecular data

Page 12: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 13: Data Visualization to Enhance our Understanding of the Cancer Genome

CANCER SUBTYPES

Page 14: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 15: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 16: Data Visualization to Enhance our Understanding of the Cancer Genome

mRNA expression microRNA expression

DNA methylation

protein expression

copy number variants mutation calls

clinical parameters

Page 17: Data Visualization to Enhance our Understanding of the Cancer Genome

mRNA expression - clustering

Page 18: Data Visualization to Enhance our Understanding of the Cancer Genome

C4 C3 C2 C1

mRNA expression - clustering

Page 19: Data Visualization to Enhance our Understanding of the Cancer Genome

C4 C3 C2 C1

mRNA expression

copy number variants

- clustering

- gene X

Page 20: Data Visualization to Enhance our Understanding of the Cancer Genome

C4 C3 C2 C1

mRNA expression

copy number variants

DEL NORMAL AMP

- clustering

- gene X

Page 21: Data Visualization to Enhance our Understanding of the Cancer Genome

DEL NORMAL AMP

C4 C3 C2 C1

mRNA expression

copy number variants

mutation calls

- clustering

- gene X

- gene Y

Page 22: Data Visualization to Enhance our Understanding of the Cancer Genome

DEL NORMAL AMP

C4 C3 C2 C1

mRNA expression

copy number variants

mutation calls

WILDTYPE MUT

- clustering

- gene X

- gene Y

Page 23: Data Visualization to Enhance our Understanding of the Cancer Genome

DEL NORMAL AMP

C4 C3 C2 C1

mRNA expression

copy number variants

mutation calls

WILDTYPE MUT

- clustering

- gene X

- gene Y

Page 24: Data Visualization to Enhance our Understanding of the Cancer Genome

DEL NORMAL AMP

C4 C3 C2 C1

mRNA expression

copy number variants

mutation calls

WILDTYPE MUT

- clustering

- gene X

- gene Y

Page 25: Data Visualization to Enhance our Understanding of the Cancer Genome

ALGORITHMIC APPROACHES

VISUALIZATION APPROACHES

unsupervised clustering of multivariate data

integrated clustering across data types (e.g., Mo et al., PNAS, 2013)

correlation testing

integrative heatmaps

network-based stratification (Hofree et al., Nat Methods, 2014)
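As a rough illustration of the "correlation testing" item, the sketch below checks whether a mutation is over-represented in one cluster using a one-sided Fisher's exact test. The choice of test, the function name `fisher_exact_greater`, and the counts are assumptions made for this example; the slides do not say which test is used.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: patients inside / outside the cluster.
    Columns: mutated / wildtype.
    Returns the probability of observing at least `a` mutated patients
    inside the cluster if cluster membership and mutation status were
    independent (hypergeometric null with fixed margins).
    """
    in_cluster = a + b
    mutated = a + c
    n = a + b + c + d
    largest_a = min(in_cluster, mutated)
    total = comb(n, mutated)
    tail = sum(
        comb(in_cluster, k) * comb(n - in_cluster, mutated - k)
        for k in range(a, largest_a + 1)
    )
    return tail / total

# Hypothetical counts: 3 of 4 patients in the cluster are mutated,
# compared with 1 of 4 patients outside the cluster.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

With cohorts this small the test has little power; the point is only to show the shape of the calculation.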

Page 26: Data Visualization to Enhance our Understanding of the Cancer Genome

Verhaak et al., Cancer Cell, 2010

Matrix Visualization: Publication Figure

Page 27: Data Visualization to Enhance our Understanding of the Cancer Genome

ALGORITHMIC APPROACHES

VISUALIZATION APPROACHES

unsupervised clustering of multivariate data

integrated clustering across data types (e.g., Mo et al., PNAS, 2013)

correlation testing

integrative heatmaps

network-based stratification (Hofree et al., Nat Methods, 2014)

genome browsers (UCSC Cancer Genomics Browser, IGV)

Page 28: Data Visualization to Enhance our Understanding of the Cancer Genome

Robinson et al., Nat Biotech, 2011

Genome-Centric Visualization: IGV

Page 29: Data Visualization to Enhance our Understanding of the Cancer Genome

PROBLEM 1

Visualize overlap of patient sets across two or more stratifications.

PROBLEM 2

Visualize characteristics of patient sets within a stratification of interest.
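The quantity behind Problem 1 can be sketched as a table of overlap counts between the groups of two stratifications. The patient IDs, group labels, and the helper `overlap_table` below are hypothetical and only illustrate the idea; they are not taken from any of the tools shown in the talk.

```python
from collections import Counter

def overlap_table(strat_a, strat_b):
    """Count patients shared by each pair of groups from two stratifications.

    strat_a, strat_b: dicts mapping patient ID -> group label.
    Only patients present in both stratifications are counted.
    """
    shared = strat_a.keys() & strat_b.keys()
    return Counter((strat_a[p], strat_b[p]) for p in shared)

# Hypothetical toy data: mRNA clusters and copy number status of a gene X.
mrna_clusters = {"P1": "C1", "P2": "C1", "P3": "C2", "P4": "C2", "P5": "C3"}
cnv_gene_x = {"P1": "AMP", "P2": "AMP", "P3": "NORMAL", "P4": "DEL", "P6": "DEL"}

table = overlap_table(mrna_clusters, cnv_gene_x)
for (cluster, status), count in sorted(table.items()):
    print(cluster, status, count)
```

Each non-zero cell corresponds to one set of patients that two groups have in common, which is the information a visualization of the overlap would need to encode.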

Page 30: Data Visualization to Enhance our Understanding of the Cancer Genome

A Lex, M Streit, H-J Schulz, C Partl, D Schmalstieg, PJ Park, N Gehlenborg, Comput Graph Forum, 2012

M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, PJ Park, N Gehlenborg, Nat Methods, 2014

Divide & Conquer Visualization: StratomeX

Page 31: Data Visualization to Enhance our Understanding of the Cancer Genome

PROBLEM 1

Visualize overlap of patient sets across two or more stratifications.

PROBLEM 2

Visualize characteristics of patient sets within a stratification of interest.

PROBLEM 3

Identify relevant stratifications, pathways, and clinical variables.

Page 32: Data Visualization to Enhance our Understanding of the Cancer Genome

Is there a mutation that overlaps with this mRNA cluster?

Is there a CNV that affects survival?

Is there a pathway that is enriched in this cluster?

Is there a mutually exclusive mutation?

Query

Stratifications Clinical Params

Pathways

GUIDED EXPLORATION

M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, PJ Park, N Gehlenborg, Nat Methods, 2014

Page 33: Data Visualization to Enhance our Understanding of the Cancer Genome

Query

Rank

Visualize

Stratifications

Clinical Params Pathways

GUIDED EXPLORATION

M Streit, A Lex, S Gratzl, C Partl, D Schmalstieg, H Pfister, PJ Park, N Gehlenborg, Nat Methods, 2014
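The query and rank steps can be sketched for a question such as "Is there a mutation that overlaps with this mRNA cluster?". The example scores each candidate patient set against the query cluster and sorts by score. The Jaccard index is used here as one common overlap score and is an assumption of this sketch, as are the function names and the toy patient sets.

```python
def jaccard(a, b):
    """Jaccard index of two patient sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_by_overlap(query, candidates):
    """Score every candidate set against the query set, best match first."""
    scored = [(name, jaccard(query, members)) for name, members in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical query: the patients in one mRNA cluster.
query_cluster = {"P1", "P2", "P3", "P4"}

# Hypothetical candidates: patient sets defined by other data types.
candidates = {
    "gene Y mutated": {"P1", "P2", "P3"},
    "gene X amplified": {"P3", "P4", "P5", "P6"},
    "gene Z deleted": {"P7", "P8"},
}

ranking = rank_by_overlap(query_cluster, candidates)
for name, score in ranking:
    print(f"{score:.2f}  {name}")
```

The top-ranked candidates are the ones worth adding to the visualization next, which is the third step of the loop.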

Page 34: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 35: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 36: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 37: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 38: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 39: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 40: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 41: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 42: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 43: Data Visualization to Enhance our Understanding of the Cancer Genome
Page 44: Data Visualization to Enhance our Understanding of the Cancer Genome

AND NOW WHAT?

Page 45: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

DATA

hypothesis generation

EXPLORATION “Pattern Discovery”

Page 46: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

PUBLICATION

experiment

DATA

INSIGHT HYPOTHESIS

interpretation

DATA

hypothesis generation

EXPLORATION

PRESENTATION “Storytelling”

“Pattern Discovery”

Page 47: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

Page 48: Data Visualization to Enhance our Understanding of the Cancer Genome

finding figure/video

Authoring

Exploration Presentation

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

finding figure/video

Authoring

Exploration Presentation

Current Model

Page 49: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

finding figure/video

Authoring

Exploration Presentation

What we show.

Page 50: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

finding figure/video

Authoring

Exploration Presentation

What we tell.

Page 51: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

finding figure/video

Authoring

Exploration Presentation

What we did.

Page 52: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

Page 53: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

track provenance

annotate observations

make sense of observations

tell the story

Page 54: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

Capture

Label

Understand

Explain

track provenance

annotate observations

make sense of observations

tell the story

C

L

U

E
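The four steps can be sketched as a small provenance graph: each captured state records the action that produced it and its parent state, states can be labeled with annotations, and a story is the path of states leading to a finding. The class and method names below are hypothetical and are not the API of any CLUE implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class State:
    """One captured state of the visualization (a node in the graph)."""
    state_id: int
    action: str
    parent: Optional[int] = None
    annotation: str = ""

class ProvenanceGraph:
    def __init__(self) -> None:
        self.states: Dict[int, State] = {}
        self.current: Optional[int] = None

    def capture(self, action: str) -> int:
        """Capture: record a new state as a child of the current one."""
        state_id = len(self.states)
        self.states[state_id] = State(state_id, action, parent=self.current)
        self.current = state_id
        return state_id

    def label(self, state_id: int, text: str) -> None:
        """Label: attach an observation to a captured state."""
        self.states[state_id].annotation = text

    def jump_to(self, state_id: int) -> None:
        """Return to an earlier state; the next capture starts a new branch."""
        if state_id not in self.states:
            raise KeyError(state_id)
        self.current = state_id

    def story(self, state_id: int) -> List[State]:
        """Explain: the ordered path of states from the root to `state_id`."""
        path: List[State] = []
        cursor: Optional[int] = state_id
        while cursor is not None:
            state = self.states[cursor]
            path.append(state)
            cursor = state.parent
        return list(reversed(path))

# Hypothetical analysis session with one abandoned branch.
graph = ProvenanceGraph()
s0 = graph.capture("load mRNA clustering")
s1 = graph.capture("add copy number column for gene X")
graph.jump_to(s0)
s2 = graph.capture("add mutation column for gene Y")
graph.label(s2, "observation noted here")

for state in graph.story(s2):
    print(state.action, "|", state.annotation)
```

Because every state keeps a link to its parent, the abandoned branch stays in the graph while the story only replays the steps that led to the labeled state.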

Page 55: Data Visualization to Enhance our Understanding of the Cancer Genome

CLUE

vistories

Authoring

Exploration Presentation

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

CLUE Model

Page 56: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 57: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 58: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 59: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 60: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 61: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 62: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 63: Data Visualization to Enhance our Understanding of the Cancer Genome

Exploration

Authoring

Presentation

Page 64: Data Visualization to Enhance our Understanding of the Cancer Genome

VISTORY = visual story + history

Do collaborative data analysis.

Use during peer-review.

Publish with a paper.

Embed in a presentation.

Page 65: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

Page 66: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

Page 67: Data Visualization to Enhance our Understanding of the Cancer Genome

DATA-DRIVEN DISCOVERY

DATA-DRIVEN COMMUNICATION

http://vistories.org

Demos and prototypes built with

Page 68: Data Visualization to Enhance our Understanding of the Cancer Genome

We are hiring postdocs & developers!

HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS

See http://gehlenborglab.org or http://dbmi.med.harvard.edu for details.

Data visualization, analysis, and management for:

• genomic structural variants
• dynamics of the 3D genome
• cancer subtypes in patient cohorts
• exploration tools for data repositories
• provenance graphs