Inferring clonal composition of a breast cancer from multiple tissue samples. Habil Zare, Department of Genome Sciences, University of Washington, 19 Dec 2013.


Page 1:

Inferring clonal composition of a breast cancer from multiple tissue samples

Habil Zare

Department of Genome Sciences, University of Washington

19 Dec 2013

Page 2:

Hypothesis

Because cancer is a heterogeneous disease, synergistic medications can treat it better than a single drug.

Page 3:

Traditional concept of a tumor

Schematic figure

Page 4:

Most tumors are heterogeneous

Schematic figure

Page 5:

Different clones have different genotypes and phenotypes

Clone 1

Clone 2

Clone 3

Clone 4

Clone 5

Clone 6

Schematic figure

Page 6:

It is important to identify the clonal composition

Treatment A

Treatment B

Relapse

Relapse

Page 7:

It is important to identify the clonal composition

?

Page 8:

It is important to identify the clonal composition

?

Page 9:

Our approach to analyze multiple samples from a single tumor

Page 10:

Our approach to analyze multiple samples from a single tumor

Page 11:

Each sample has different information about the clonal composition

PCR

PCR

PCR

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Counting the number of reads which support each mutation

Page 12:

A closer look at the Next-Gen Sequencing output

• At each locus, two integers are provided: the total number of analyzed reads, and the number of reads supporting the mutation.

• Because different clones have different contributions to each sample, these numbers vary across the samples.

How to use this variation to infer the clonal composition?
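
As a small illustration of the two integers per locus, the sketch below turns them into an observed variant allele frequency for each sample. The read counts are made-up values and the function name `variant_allele_frequency` is hypothetical; nothing here comes from the actual data set.

```python
from typing import List, Tuple

# Hypothetical read counts for ONE locus across three samples:
# (total reads analyzed, reads supporting the mutation).
counts: List[Tuple[int, int]] = [(200, 20), (150, 60), (400, 0)]


def variant_allele_frequency(total: int, variant: int) -> float:
    """Fraction of the analyzed reads at a locus that support the mutation."""
    if total <= 0:
        raise ValueError("total reads must be positive")
    if not 0 <= variant <= total:
        raise ValueError("variant reads must lie between 0 and total")
    return variant / total


vafs = [variant_allele_frequency(t, v) for t, v in counts]
for sample, vaf in enumerate(vafs, start=1):
    print(f"sample {sample}: observed variant allele frequency = {vaf:.2f}")
```

The same mutation shows a different frequency in each sample, which is the variation the model tries to explain through differing clone contributions.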

Page 13:

The observations

The observations boil down to the number of reads that support each allele.

• M samples

• Mutations on N loci

Building a generative model

Tumor

Page 14:

Building a generative model

Given the parameters, how to generate data?

Data

Parameters

Generate

Page 15:

Data

Parameters

Generate

?

Building a generative model

Given the parameters, how to generate data?

Page 16:

Generate

Building a generative model

Parameters?

Page 17:

The main assumption on the distribution of reads

Mutation i can be present or absent in each clone

Project on Mutation i

Building a generative model

Assumption: Reads are analyzed uniformly at random => Binomially distributed
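
To make the binomial assumption concrete, here is a minimal sketch that draws the number of variant-supporting reads when each read independently shows the variant with a fixed probability. The depth (1000 reads), the frequency (0.25), and the helper name `sample_variant_reads` are illustrative assumptions, not values from the talk.

```python
import random


def sample_variant_reads(total_reads: int, variant_freq: float,
                         rng: random.Random) -> int:
    """Draw a Binomial(total_reads, variant_freq) count, one read at a time."""
    return sum(1 for _ in range(total_reads) if rng.random() < variant_freq)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_variant_reads(1000, 0.25, rng) for _ in range(200)]
mean_draw = sum(draws) / len(draws)
print(f"mean variant reads over {len(draws)} draws: {mean_draw:.1f} "
      f"(binomial expectation: 250)")
```

Individual draws scatter around the expectation; that scatter is the sampling noise the inference has to see through.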

Page 18:

The main assumption on the distribution of reads

Mutation i can be present or absent in each clone

Project on Mutation i

Number of reads exhibiting the variant allele at locus i in sample j.

Total number of reads

Frequency of variant allele

Assumption

Building a generative model

Page 19:

A close look at the binomial distribution

Total number of reads: observed

Frequency of variant allele: ?

Number of reads exhibiting the variant: observed

The frequency of the variant allele depends on:

1. Which clones contain mutation i?

2. What is the frequency of those clones in sample j?

Building a generative model

Page 20:

Introducing the hidden variables

If Z_{i,c} = 1, clone c has a variant allele at locus i.

The frequency of the variant allele depends on:

1. Which clones contain mutation i?

2. What is the frequency of those clones in sample j?

Building a generative model

Page 21:

Notation for the model parameters

The frequency of the variant allele depends on:

1. Which clones contain mutation i?

2. What is the frequency of those clones in sample j?

Building a generative model
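
The two questions above can be combined in a short sketch: if Z marks which clones carry each mutation and P holds the clone frequencies per sample, one simple way to get the variant allele frequency is to add up the frequencies of the carrying clones. The slide's own formula did not survive in this transcript, so the exact form used here (a plain sum, with no correction for ploidy or heterozygosity), the orientation of P as clones by samples, and the name `variant_frequency` are all assumptions.

```python
from typing import List


def variant_frequency(Z: List[List[int]], P: List[List[float]]) -> List[List[float]]:
    """theta[i][j] = sum over clones c of Z[i][c] * P[c][j] (assumed form)."""
    n_clones, n_samples = len(P), len(P[0])
    return [
        [sum(z_row[c] * P[c][j] for c in range(n_clones)) for j in range(n_samples)]
        for z_row in Z
    ]


# Hypothetical toy values: 3 loci, 3 clones, 2 samples.
Z = [[1, 1, 0],   # mutation 0 is present in clones 0 and 1
     [0, 1, 0],   # mutation 1 is present in clone 1 only
     [0, 0, 1]]   # mutation 2 is present in clone 2 only
P = [[0.5, 0.2],  # clone 0 frequency in samples 0 and 1
     [0.3, 0.2],  # clone 1
     [0.2, 0.6]]  # clone 2 (each column sums to 1)

theta = variant_frequency(Z, P)
for i, row in enumerate(theta):
    print(f"mutation {i}: " + ", ".join(f"{t:.2f}" for t in row))
```

A real tumor model would likely scale these values for heterozygous mutations and for normal cells in the sample; the sketch leaves that out.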

Page 22:

Building a generative model

Parameters

C

Generate

?

Page 23:

The assumptions

• Each mutation can occur at a locus independently at random.

• The samples are independent of each other.

Building a generative model

Page 24:

Building a generative model

Parameters

C

Generate

Technical

Page 25:

Overview of the generative model from parameters to the observations

C

Parameters

Observations
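
The path from parameters to observations can be sketched end to end as follows. This is a toy version under stated assumptions, not the talk's implementation: genotype entries are drawn as independent Bernoulli(0.5) variables, clone frequencies are random weights normalized per sample, the variant allele frequency is the plain sum of carrying-clone frequencies, and every locus has a fixed depth of 500 reads. The function name `generate_counts` is hypothetical.

```python
import random
from typing import List


def generate_counts(Z: List[List[int]], P: List[List[float]],
                    depth: List[List[int]], rng: random.Random) -> List[List[int]]:
    """Return X with X[i][j] drawn as Binomial(depth[i][j], theta[i][j])."""
    n_clones, n_samples = len(P), len(P[0])
    X = []
    for i, z_row in enumerate(Z):
        row = []
        for j in range(n_samples):
            # Assumed form: frequency of the variant = total frequency of carrying clones.
            theta = sum(z_row[c] * P[c][j] for c in range(n_clones))
            row.append(sum(1 for _ in range(depth[i][j]) if rng.random() < theta))
        X.append(row)
    return X


rng = random.Random(1)
n_loci, n_clones, n_samples = 4, 3, 2

# Genotype matrix: each entry is an independent Bernoulli(0.5) draw (assumption).
Z = [[int(rng.random() < 0.5) for _ in range(n_clones)] for _ in range(n_loci)]

# Clone frequencies: positive random weights, normalized so each sample sums to 1.
weights = [[rng.random() + 1e-9 for _ in range(n_samples)] for _ in range(n_clones)]
col_sums = [sum(weights[c][j] for c in range(n_clones)) for j in range(n_samples)]
P = [[weights[c][j] / col_sums[j] for j in range(n_samples)] for c in range(n_clones)]

depth = [[500] * n_samples for _ in range(n_loci)]
X = generate_counts(Z, P, depth, rng)
for i in range(n_loci):
    print(f"locus {i}: genotype {Z[i]}, variant reads per sample {X[i]}")
```

Inference runs this picture in reverse: only `depth` and `X` are observed, and Z and P have to be recovered.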

Page 26:

Inference

Given the observed counts, how do we infer the clonal structure?

C

Inference

Technical

EM

Page 27:

We infer model parameters using expectation-maximization

Details omitted

Derived from the binomial distribution

Derived from Bernoulli distribution
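
Since the EM details are omitted, the sketch below shows only one ingredient in the spirit of an E-step: the posterior over binary genotype patterns for a single locus when the clone frequencies P are treated as known. It combines a Bernoulli prior on each genotype entry with a binomial likelihood for the read counts. It is not the talk's derivation; the prior of 0.5, the clamping constant `eps` (which also acts like a small error rate), the plain-sum form of the variant frequency, and all names are assumptions. The M-step that would update P is left out.

```python
import itertools
import math
from typing import Dict, List, Tuple


def log_binom_pmf(x: int, n: int, theta: float, eps: float) -> float:
    """Log of the Binomial(n, theta) pmf at x, with theta clamped away from 0 and 1."""
    theta = min(max(theta, eps), 1.0 - eps)
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_choose + x * math.log(theta) + (n - x) * math.log(1.0 - theta)


def genotype_posterior(x: List[int], n: List[int], P: List[List[float]],
                       prior: float = 0.5, eps: float = 1e-6
                       ) -> Dict[Tuple[int, ...], float]:
    """Posterior over genotype patterns z in {0,1}^C for one locus, given P."""
    n_clones = len(P)
    log_post = {}
    for z in itertools.product((0, 1), repeat=n_clones):
        # Bernoulli prior on each entry of the pattern.
        lp = sum(math.log(prior if zc else 1.0 - prior) for zc in z)
        # Binomial likelihood of the counts in every sample.
        for j, (xj, nj) in enumerate(zip(x, n)):
            theta = sum(z[c] * P[c][j] for c in range(n_clones))
            lp += log_binom_pmf(xj, nj, theta, eps)
        log_post[z] = lp
    top = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {z: math.exp(lp - top) for z, lp in log_post.items()}
    total = sum(unnorm.values())
    return {z: w / total for z, w in unnorm.items()}


# Hypothetical example: 3 clones, 2 samples, 200 reads per sample.
P = [[0.5, 0.2], [0.3, 0.2], [0.2, 0.6]]
posterior = genotype_posterior(x=[160, 80], n=[200, 200], P=P)
best = max(posterior, key=posterior.get)
print("most probable genotype pattern:", best)
```

Enumerating all 2^C patterns is only practical for a handful of clones, which matches the five- and six-clone solutions discussed later in the talk.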

Page 28:

How can we evaluate whether the model works?

Inference

Two rounds of next gen sequencing

C

Page 29:

We do not know the reality

~Inferred Reality

Page 30:

Generating synthetic data

Inference

C`

Generate

Page 31:

Inference

C

Generate

Generating synthetic data

Random parameters

compare

Page 32:

Defining accuracy criteria

• Genotype error: The frequency of false entries in the genotype matrix Z

• Clone frequency error: The average error in entries of the frequency matrix P
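
The two criteria could be computed along the following lines. This is a sketch under assumptions: "error in entries of P" is taken to be the absolute difference, Z is stored as loci by clones and P as clones by samples, and because inferred clones carry arbitrary labels, every relabeling is tried and the one with the lowest genotype error is kept. The talk does not say how clones were matched, so that step and all function names are hypothetical.

```python
import itertools
from typing import List, Tuple


def genotype_error(Z_true: List[List[int]], Z_hat: List[List[int]]) -> float:
    """Fraction of entries of the genotype matrix that are wrong."""
    pairs = [(a, b) for rt, rh in zip(Z_true, Z_hat) for a, b in zip(rt, rh)]
    return sum(a != b for a, b in pairs) / len(pairs)


def clone_frequency_error(P_true: List[List[float]], P_hat: List[List[float]]) -> float:
    """Mean absolute difference between entries of the frequency matrices."""
    diffs = [abs(a - b) for rt, rh in zip(P_true, P_hat) for a, b in zip(rt, rh)]
    return sum(diffs) / len(diffs)


def best_alignment_errors(Z_true, P_true, Z_hat, P_hat) -> Tuple[float, float]:
    """Try every relabeling of inferred clones; keep the lowest genotype error."""
    best = None
    for perm in itertools.permutations(range(len(P_true))):
        Z_perm = [[row[c] for c in perm] for row in Z_hat]  # reorder clone columns
        P_perm = [P_hat[c] for c in perm]                   # reorder clone rows
        errs = (genotype_error(Z_true, Z_perm), clone_frequency_error(P_true, P_perm))
        if best is None or errs < best:
            best = errs
    return best


# Hypothetical truth and a hypothetical inference with clones 0 and 1 swapped,
# one wrong genotype entry, and slightly wrong frequencies.
Z_true = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
P_true = [[0.5, 0.2], [0.3, 0.2], [0.2, 0.6]]
Z_hat = [[1, 1, 0], [1, 0, 0], [0, 1, 1]]
P_hat = [[0.3, 0.25], [0.5, 0.15], [0.2, 0.6]]

geno_err, freq_err = best_alignment_errors(Z_true, P_true, Z_hat, P_hat)
print(f"genotype error: {geno_err:.3f}, clone frequency error: {freq_err:.3f}")
```

Trying all permutations is fine for five or six clones; a larger number of clones would call for an assignment algorithm instead.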

Page 33:

Simulation shows genotype error decreasing with increasing samples

Page 34:

Simulation shows genotype error decreasing with increasing samples

Page 35:

Clone frequency error shows a similar trend

Page 36:

Experiment with real data

Study on a primary breast cancer

• 10 breast tumor samples

• 1 adjacent normal

• 2 samples from the metastatic lymph node

Figure labels: M1, P1, P3, P2

Page 37:

Clone frequencies vary smoothly across the tumor sections

The model doesn’t know anything about the anatomic location of the samples!

Page 38:

Clone frequencies vary smoothly across the tumor sections

Page 39:

Phylogenetic analysis tells the story of the tumor over time

Page 40:

Five-clone solution

Page 41:

Six-clone solution is consistent with five-clone solution

Page 42:

Overview of the project

Inferring clonal composition of a breast cancer from multiple tissue samples

Next-Gen Sequencing Data

Oncologists

Clonal structure

EM

Validated by simulations

Anatomic variation of clones

Phylogenetic trees

Page 43:

Software publicly available

Page 44:

Supplementary slides

Page 45:

Proposed project based on former experiences:

Identifying clonal decomposition using sub-tissues

SamSPECTRAL

Sort cell populations

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Next Gen Sequencing

Leukemia or lymphoma sample

Clonal analysis