Coding and noncoding mutations across cancer...
Transcript of Coding and noncoding mutations across cancer...
Coding and noncoding mutations across cancer types
Ekta Khurana, PhD Assistant Professor
Meyer Cancer Center, Institute for Computational Biomedicine & Institute for Precision Medicine
Department of Physiology and Biophysics Weill Cornell Medicine, New York, NY
1
@ekta_khurana
2
First cancer WGS,
Ley, Mardis, et al., Nature, 2008
~500 cancer WGS Alexandrov, et al.,
Nature, 2013
~2,800 WGS from ICGC/TCGA
2014
Number of cancer whole genomes sequenced
Genomic variants identified from sequencing
3
ATGAACTGCAATTTCCAGAAGCATGCACCCTTGGAAG - - - TCTA
ATGAACTGCAAATTCCAGAAGCATG - - - - CTTGGAAGAGTTCTASNP
Human Ref.
Dele=on Inser=on Small Indels < 50 bp
Inser=ons
Dele=ons
Duplica=ons Inversion
Large structural variants
Human Ref.
An average human genome contains ~4 million inherited variants and a tumor genome contains thousands of somaMc
variants.
4
Khurana et al, Nature Rev Genet, 2016
IdenMfying mutaMons associated with cancer
Signals of positive selection Mutation frequency of cancer genes
6
Source: Interna=onal Cancer Genome Consor=um (dcc.icgc.org)
Computational methods to predict functional impact of missense mutations
Gnad et al, BMC Genomics, 2013
8
Most variants are
in noncoding
regions
Khurana et al, Nature Rev Genet, 2016
MB: medulloblastoma DLBC: B cell lymphoma STAD: gastric BRCA: breast PAAD: pancrea=c PRAD: prostate LIHC: liver PA: pilocy=c Astrocytoma LUAD: Lung adenocarcinoma
9
CGGAGG
CGGAAG mRNA
Gain-of-motif TF
TF 0.0
1.0
2.0WT
Mutated
Loss-of-motif
Loss-of-motif
TATCTAT X
TF
TATTTAT TF
WT
Mutated
T0.0
1.0
2.0
CGT
ATG
TGCA5G
CTA
CTG
ATT10AG
TGA
G
ACT
CGAT
ACT15GTA
C
Altered binding effects 2.0
Promoter Gene
X
• MYB mo=f created & drives TAL1
overexpression in T-‐ALL (Mansour et al, Science, 2014)
Noncoding mutations can be significant drivers Transcription factor (TF) binding disruption
TERT promoter mutated in many different cancer types
Killela et al, PNAS, 2013 Horn et al, Science, 2013 Huang et al, Science, 2013
Noncoding elements in the genome
10
Khurana et al, Nature Rev Genet, 2016
11
Khurana et al, Nature Rev Genet, 2016
Noncoding variants act via tissue-specific regulatory networks
Need to account for heterogeneity of mutation rate in
cancer cells when identifying drivers
• Histone modification marks
• DNase I hypersensitive sites
• Replication timing
Polak et al. Nature 518, 360-364 (2015)
12
Co-variates of mutation rates: Increased mutation density at TF binding sites in melanoma and lung
cancer
13
Perera et al, Nature, 2016 Sabarinathan et al, Nature, 2016 Khurana, Nature News & Views, 2016
Cuykendall et al, COISB, 2017
Khurana et al, Nature Rev Genet, 2016
IdenMfying mutaMons associated with cancer
FunSeq
15
EvoluMonary conservaMon -‐ Typically defined by comparison across species
ConservaMon among humans -‐ Deple=on of common variants/Enrichment of rare variants
1
2
4
3
Estimating negative selection
Common variant Rare variants
Frac=on of rare variants = (Num of rare variants/ Total num of variants)
Enrichment of rare SNPs as a metric for negative selection
16
0.55
0.60
0.65
0.70
0.4
0.5
0.6
0.7
0.8
0.9
A
Frac
tion
of ra
re S
NPs
(non
syn)
All C
odin
g
LOF-
tol.
Rece
ssive
GW
AS
Dom
inan
t
Esse
ntia
l
Canc
er
B
Frac
tion
of ra
re S
NPs
DHS Coding
Bone
Urot
heliu
mEm
br s
tem
cel
lBr
east
Bloo
dPa
ncr d
uct
Panc
reas
Skin
Eye
Live
rM
uscle
Bloo
d ve
ssel
Hear
tLu
ngBl
astu
laM
yom
etriu
mEp
ithel
ium
Pros
tate
Brai
nEm
br lu
ngG
ingi
val
Kidn
eySp
inal
cor
dFo
resk
inCo
nnec
tive
Adip
ose
Live
rCe
rvix
Trac
hea
Sple
enPr
osta
teEs
opha
gus
Lung
Test
isPl
acen
taBl
adde
rCo
lon
Hear
tTh
yroi
dTh
ymus
Kidn
eyBr
ain
Ova
ry
(rare=derived allele freq < 0.5%)
• Depletion of common polymorphisms in regions under selection
Negative selection restricts the allele frequency of deleterious mutations.
• Results for coding genes consistent with known phenotypic impacts
• Other metrics for selection • Evolu=onary conserva=on
(e.g. GERP) • SNP density
(confounded by muta=on rate)
LOF-‐tol (Loss-‐of-‐funcMon tolerant): least negaMve selecMon Cancer: most selecMon Khurana et al., Science, 2013
Organism-level negative selection in noncoding elements
17 Khurana et al., Science, 2013
Which noncoding categories are under very strong “coding-like” selection ?
q Top categories among ranked 102 categories
q Binding peaks of some general TFs (eg FAM48A)
q Core mo=fs of some TF families (eg JUN, GATA)
q DHS sites in spinal cord and connec=ve =ssue
~0.4% genomic coverage (~ top 25)
~0.02% genomic coverage (top 5)
~400-‐fold
~40-‐fold
Enrichment of know disease-‐causing muta=ons from Human Gene Muta=on
database
Khurana et al., Science, 2013 18
19
Identification of noncoding mutations with high impact: FunSeq
FunSeq2: Feature weight - Weighted with mutation patterns in natural polymorphisms
(features frequently observed weighed less) - entropy based method
!! = 1+ !!!"#!!! + 1− !! !"#! 1− !! !!!
!!"#$%! = ! !!!!!!!"!!"#$%&$'!!"#$%&"'!
20
HOT region
Sensi=ve region
Polymorphisms
Genome
p = probability of the feature overlapping natural polymorphisms
Feature weight:
For a variant:
wdp
Fu et al., Genome Biology, 2014 hgps://github.com/khuranalab/FunSeq_PCAWG
IdenMfying noncoding variants
associated with cancer
CNCDriver FunSeq
22
CNCDriver for detecting driver coding & noncoding elements
CNCDriver results in lung cancer (n=84)
23
Functional validation of candidates in prostate cancer
WDR74 promoter
q Sanger sequencing in 19 additional samples confirms the recurrence
q WDR74 shows increased expression in tumor samples
benign
PCa
24
RET promoterIncreased activity
EIF4EBP3 promoterReduced activity
25
International Cancer Genome Consortium & The Cancer Genome Atlas
~2800 WGS (tumor & normal), ~1500 RNA-‐Seq, ~1400 methylaMon
26
Acknowledgements
Yale Yao Fu (now at Bina), Xinmeng Mu (now at
Broad), Jieming Chen, Lucas Lochovsky, Arif Harmanci, Alexej Abyzov,
Suganthi Balasubramanian, Cristina Sisu, Declan Clarke, Mike Wilson, Yong Kong, Mark
Gerstein
Sanger Vincenza Colonna, Yuan Chen, Yali Xue, Chris
Tyler-Smith
Cornell Steven Lipkin, Jishnu Das, Robert Fragoza,
Xiaomu Wei, Haiyuan Yu
Andrea Sboner, Dimple Chakravarty, Naoki Kitabayashi, Vaja Liluashvili,
Zeynep H. Gümüş, Kellie Cotter, Mark A. Rubin
U of Michigan Hyun Min Kang U of Geneva
Tuuli Lappalainen (NYGC), Emmanouil T. Dermitzakis
Baylor Daniel Challis, Uday Evani, Donna
Muzny, Fuli Yu, Richard Gibbs EBI
Kathryn Beal, Laura Clarke, Fiona Cunningham, Paul Flicek, Javier Herrero, Graham R. S. Ritchie
Boston College Erik Garrison, Gabor Marth
Mass Gen Hospital Kasper Lage, Daniel G. MacArthur,
Tune H. Pers Rutgers
Jeffrey A. Rosenfeld
FuncMonal InterpretaMon
Group ~50 par=cipants
~40 Ins=tutes ~550 par=cipants
Khurana lab Eric Minwei Liu Priyanka Dhingra Alexander Fundichely Tawny Cuykendall Andre Forbes Mark Rubin Dimple Chakravarty Kellie Cotter David Rickman Adeline Berger Andrea Sboner Deli Liu Steve Lipkin PCAWG (ICGC/TCGA) collaboration (~100)
Acknowledgements