Dr. Stephen Chanock: Genome-wide Association Studies
Transcript of Dr. Stephen Chanock: Genome-wide Association Studies
Genome-wide Association Studies
in Cancer:
A Step in the Right Direction
Stephen Chanock, M.D.
Chief, Laboratory of Translational Genomics Director, Core Genotyping Facility
June 20, 2012
Genetic Predisposition to Breast Cancer, European Population
[Figure: population genotype relative risk (y-axis) vs. population risk-allele frequency (x-axis)]
Genes labeled: BRCA1, BRCA2, TP53, PTEN, CHEK2, ATM, PALB2, BRIP1, RAD51C, ERCC2
Loci by source: BCAC, CGEMS/BCAC, WTCCC, Other
> Doubling in 2012
97 Genes Mutated in Cancer Susceptibility Syndromes
C Kratz
ATM, PTCH, FLCN, BLM, BRCA1, BRCA2, T, MLH1, MSH2, PMS2, MSH6, HRAS, PTEN,
DKC1, TERC, TERT, TINF2, WRAP53, TP53, NF1, EXT1, EXT2,
FANCA, FANCB, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCL, FANCJ, FANCM, FANCN/BRIP1,
RAD51C, TSC2, CDH1, KIT, PDGFRA, HRPT2, RUNX1, CHEK2, CDKN2A, CDK4, BUB1B, MEN1, RET,
ALK, PHOX2B, NF2, NBS1, SDHD, SDHC, SDHB, STK11, APC, BMPR1A, SMAD4, MUTYH, FH, MET,
RB1, SMARCB1, RECQL4, GPC3, TSC1, ERCC4, VHL, WRN, WT1, XPA, ERCC3, XPC, ERCC2, DDB2,
SLX4, ERCC5, POLH, GATA2, CEBPA, SDHA, SDHAF2, TMEM127, MAX, DICER1, BAP1, ELANE, HAX1,
CBL, PTPN11, MITF, GALNT12, HOXB13, CYLD, EGFR, SUFU, DIS3L2, SBDS
Genome-Wide Association Studies: Age of Discovery
• Discovery of New Regions in the Genome Associated with Diseases/Traits
• New “Candidate Genes & Regions”
• Clues for Mechanistic Insights Into the Contribution of Common Genetic Variation to Cancer Biology
• Challenge of Genetic Markers for Risk Prediction
• Individual Risk
• Public Health Decisions
• Polygenic Risk Models
Published Cancer GWAS Etiology Hits: 5.15.12
~240 Disease Loci marked by SNPs
1 Locus marked by a CNV
Chung & Chanock
[Figure: loci grouped by cancer type; the assignment of individual loci to cancers is not recoverable from the extracted text]
Loci per cancer: 48 Prostate, 13 Colorectal, 7 Glioma, 7 Basal Cell Carcinoma, 9 Melanoma, 6 Neuroblastoma, 4 Pancreas, 7 Nasopharyngeal, 6 Lung, 6 Testicular, 11* Bladder, 26* Breast, 4 Pediatric Acute Lymphoblastic Leukemia, 7 Esophageal Squamous, 4 Thyroid, 2 Non-Hodgkin, 5 Ovary, 3 Gastric, 10 CLL, 9 Multiple, 1 Liver, 3 Hodgkins, 3 Ewing Sarcoma, 3 Kidney, 2 Wilms
Locus labels, in extracted order:
PSCA, BNC2, ADH1B, C20orf54, ALDH2, FOXE1, NKX2-1, 6p21.33, CTBP2, 11p15, THADA, EHBP1, ITGA6, 3p11.2,
EEFSEC, PDLIM5, TET2, SLC22A3, JAZF1, LMTK2, 8p21, 8q24.21 (x5), MSMB/NCOA4, 17q24.3, 19q13.2, KLK2/KLK3, BIK, NUDT10/NUDT11,
C2orf43, 5p15, FOXP4, GPRC6A, 13q22, 10p14, 11q23.1, BMP4, GREM1, CDH1, SMAD7, RHPN2, BMP2, EIF3H, 8q24.21,
PRKD2, 15q21.3, 15q23, 16q24.1, 11q24.1, 2q13, FARP2, IRF4, 8q24.21, CDKN2A/CDKN2B, RTEL1, PHLDB1, TERT, CDKN2A/CDKN2B, CCDC26,
1p36, 1q42, KIF1B, TERT/CLPTM1L, 7q32, TYR, KRT5, ASIP, MC1R, CDKN2A/CDKN2B, TYR, PLCE1, ASIP,
1q21.1, BARD1, 6p22, KLF5/KLF12, ABO, 1q32.1, CLPTM1L, ITGA9, GABBR1, HLA-F, HLA-A, IKZF1, ARIDB5, CEBPE,
CHRNA3/CHRNA5, 6p21.32, TERT/CLPTM1L, SPRY4, BAK1, KITLG, GSTM1 deletion, TACC3, NAT2, PSCA, MYC,
FGFR2, LSP1, TOX3, COX11/STXBP4, RAD51L1, 1p11.2, 2q35, SLC4A7/NEK10, 5p12, 5q11.2, ECHDC1/RNF146, C6orf97/ESR1,
ZNF365, 10p15.1, 10q22.3, DMRT1, ATF7IP, 1q22, 2q31, 19p13, 12q13, 3q26, 1q41, TP63, GATA3, REL,
11q13, 22q13, CCNE1, 2q37, HNF1B x 2, 10q21.2, 7q32, 9q31.2, 1q42.12, 1q21.3, CASP8, ATM, MX2, LMO1, SLC14A1,
5q11, 6p21, 21q22, 13q12.125, 22q12.25, 5q13, 3q13, EGFR, TARDBP, ERG2, 15q15, HSD17B12, LINC00340, DDX4,
3p12.1, 10q26, FAM111A, EPAS1, 12p, DDX1, DLG2
Another 85 coming soon… breast, bladder, kidney, lung, ovary, prostate, TGCT
Lessons from GWAS Discovery
• P values matter, to protect against a sea of false positives
• Size matters
• Design “sort of” matters…….
• Mapping is required to explore each region prior to functional work
• Collaboration is central: can’t do it alone
Epidemiology meets Genetics to Discover Biology
Not for weak of heart or stomach…
CGF & Data Sharing
• Posted first public GWAS datasets for breast & prostate cancer in 2006
• Aggregate data removed in 2008 in response to NIH policy change
• Led development of standards for GWAS posting with dbGaP
• Contributed all DCEG GWAS datasets to dbGaP
• CGF was instrumental in addressing privacy issues with GWAS and other high-dimensional aggregate genomics data
• 11 scans currently listed on dbGaP
GWAS Regions for Testicular Cancer Point
Towards Alterations in a Common Pathway
• High Estimate for Heritability
• Strong Familial Component: MZ Twins = 75 X, DZ Twins = 20-25 X
• Strongest Effects Observed in GWAS: KITLG Heterozygote OR = 2.5, Homozygote > 5.0
GWAS Studies:
Just the Start……
“This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.”
Sir Winston Churchill @ Lord Mayor's Luncheon,
Mansion House, following the victory at El Alamein in North Africa
London, 10 November 1942.
Prostate Cancer Risk Factors 2006
• Age
• Ethnic Background
• Family History
Prostate Cancer: 48 as of May 1, 2012
> 24 More to come
[Figure: loci marked as associated with PSA or Prostate Cancer or both??, and with Type 2 Diabetes]
Locus labels, in extracted order:
CTBP2, 11p15, THADA, EHBP1, ITGA6, 3p12.1, EEFSEC, PDLIM5, TET2, SLC22A3, JAZF1, LMTK2, 8p21, 8q24.21,
MSMB/NCOA4, 11q13, HNF1B, 17q24.3, 19q13.2, KLK2/KLK3, BIK, NUDT10/NUDT11,
10q11.23, 5p15.33, 13q22.1, FOXP4, 2p24.1, RFX6, 3p11, 2q27.3, 12q13, FAM111A, 10q26
No Clear Evidence of Specific Loci
For Advanced Disease
Suggests Distinct Regions Influence
Etiology & Outcome
Prostate Cancer Risk Factors 2012
• Age
• Ethnic background
• Family history
• Multiple common alleles-
48 published & more coming….
Each common variant explains a small proportion of risk
Together 15%
Prediction is difficult,
Especially about the future.
Yogi Berra
Dan Quayle
Niels Bohr
Discriminatory Power of Genetic Risk Score for Prostate Cancer
Area Under the ROC Curve

Model      Under 65   Over 75
FHx only   0.55       0.51
G only     0.66       0.60
G + FHx    0.68       0.60
PSA        0.87       0.84

P Kraft, S Lindstrom for the BPC3
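As a rough illustration of the two quantities behind this slide, the sketch below builds a log-additive genetic risk score and computes an area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control. The function names, odds ratios and scores are hypothetical and are not taken from the BPC3 analysis.

```python
import math

def genetic_risk_score(risk_allele_counts, per_allele_odds_ratios):
    """Log-additive score: sum of (risk-allele count x ln(per-allele OR))."""
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, per_allele_odds_ratios))

def auc(case_scores, control_scores):
    """Fraction of case/control pairs in which the case scores higher (ties count 1/2)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical example: three loci with made-up per-allele odds ratios.
example_score = genetic_risk_score([2, 0, 1], [1.2, 1.1, 1.3])

# Hypothetical scores for three cases and three controls.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which the table's values of 0.55 to 0.87 should be read.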
Theoretical Limits of Risk Prediction
[Figure: two panels]
• Crohn’s Disease: sibling relative risk = 20-35
• Common cancers (Br or Pr): sibling relative risk = 2-3
Curves: Random; Using known loci; Using all estimated loci; Ideal (if we could explain all heritability)
Park et al., Nature Genetics 2010
Different Diseases Display Distinct Architectures
Next Generation GWAS
• Distinct Populations
  – Prostate Cancer in Japan or in African Americans
  – Esophageal Cancer in China
• Meta-Analyses Yield Discoveries
  – Larger Scans
  – Large Scale Replication (iCOGS)
• Shift to lower MAF
  – New Arrays vs Low-Pass Sequence Coverage
  – Imputation
• Better for Mapping Regions
  – Larger Sample Sizes Required
Distinct Differences in the
Underlying Genetic Architecture of
Different Cancers
Preliminary Phase of Estimating Differential
Contribution
Common Variants
Log additive Effects
o Epistatic Effects
o Uncommon Variants
o Rare/Familial Mutations
Genetic Predisposition to Breast Cancer, European Population
[Figure: population genotype relative risk (y-axis) vs. population risk-allele frequency (x-axis)]
Genes labeled: BRCA1, BRCA2, TP53, PTEN, ATM, CHEK2, PALB2, BRIP1, RAD51C, ERCC2
Loci by source: BCAC, CGEMS/BCAC, WTCCC, Other
> Doubling in 2012
Genomic Architecture of Prostate Cancer Susceptibility Loci
[Figure: per-allele relative risk (y-axis) vs. risk-allele frequency (x-axis); labels: BRCA2, ????]
Genetic predisposition to neuroblastoma: Current status and future directions
[Figure: effect size (y-axis) vs. allele frequency (x-axis); modified from Manolio et al., Nature 2009]
Genes labeled: DUSP12, NBPF23, HSD17B12, DDX4, NME7, LMO1, FLJ22536, BARD1, ALK, PHOX2B, TP53, SDHB, PTPN11, IL31RA
Legend: Discovered mutations; Yet to be discovered mutations; Discovered polymorphisms; Yet to be discovered polymorphisms
Additional polymorphisms to be discovered by GWAS
Additional mutations and rare polymorphisms to be discovered by NGS
Genomic Architecture of Lung Cancer Susceptibility Loci
[Figure: per-allele relative risk (y-axis) vs. risk-allele frequency (x-axis); annotations: Smoking…., Adenocarcinoma specific, Stronger in nonsmokers, ????]
Investigation of GWAS Markers
[Flowchart; box labels in extracted order]
Non-Protein Coding
Bioinformatic Analysis
Experimental Strategy
Unannotated Transcript
Regulatory Element
Epigenetic Alteration of Gene Levels
Effect on Genes Elsewhere
Novel Transcripts
Functional Elements
In vitro/vivo
miRNA
RNASeq
Expression Quantitative Trait Analysis
Histone Methylation Elements
Functional Elements
Coding Variant
Effect on Protein
Allelic Differences
mRNA Stability
Gene Product Functional Analysis
Initial Findings
Expression Quantitative Trait Analysis
10q11.2 & Prostate Cancer Risk
rs10993994 within promoter of MSMB = β-microseminoprotein
Prostate specific serum marker under study
20,000 subjects
Risk Allele “T”
Lower expression levels
Reporter assays
Electrophoretic Mobility Shifts
Levels in Prostate Tissue
Tumor Tissue
Functional Analysis
10q11.2 Could Be More Complex……
MSMB and NCOA4
RNA Expression MSMB and NCOA4
Normal
Tumor Tissue
Anchorage Independent Growth is Specific to Prostate
MSMB- Suppression
NCOA4- Over-expression
Re-sequence
454 FLX
~100 subjects
Predicted fusion transcripts (UCSC)
Genomic organization of MSMB and NCOA4
Fusion transcripts identified by 5’ RACE
Lou et al in press Hum Genet 2012
Characterization of Chimeric Transcript at 10q11.23 MSMB-NCOA4 Fusion
• RT-PCR of MSMB-NCOA4 fusion transcripts in human tissues and 9 prostate cancer cell lines
• Expression of MSMB-NCOA4 fusion protein in PC3 transfected cells
• Confirmed by Western Blot
• Immunoprecipitation anti-N (MSMB) & anti-C (NCOA4)
11q13: Multi-Cancer Susceptibility Region
Australia/ iCOGS Analysis J French
Discovery by GWAS & Mapping
CGEMS (Thomas NG 2008)
1-> 3 Loci (Chung et al HMG 2011)
GWAS
Admixture Analysis of GWAS Data:
Use of Differences in Allele Frequencies to Map Regions that Contribute to Differences in Incidence
Population Structure Analysis
Can 1000s of SNPs assist in Identifying Individuals at Higher Risk for Poor Response to Pediatric ALL?
• GWAS Scan as a Preliminary Biomarker
JJ Yang et al Nature Genetics 2011
[Figure: STRUCTURE ancestry analysis; groups: African, Native American, European, Asian]
Genome-wide association studies: Association testing
• Biometrics: Height, Weight, BMI, Menarche/Menopause (>200 regions & Height/Weight)
• Behavioral traits: Tobacco, Caffeine, Alcohol
• Nutrient levels: Vitamins D, B12, Carotene
Opportunities for:
1. Pathway Analyses
2. Polygenic Model
GIANT CONSORTIUM
Loci Associated with BMI
Rethinking “Pathways”
Genome-wide association studies
Large chromosomal abnormalities, structural variation, aneuploidy in germline DNA
Unexpected Findings
Rodriguez-Santiago AJHG 2010
Jacobs et al Nature Genetics 2012
Laurie et al Nature Genetics 2012
Genetic Mosaicism & the Aging Genome
Looking at LRR/BAF graphs
• LRR: log ratio of observed probe intensity to expected intensity. Significant deviation from zero is evidence for copy number change.
• BAF: B allele frequency, the ratio of B probe intensity to total intensity. Expected values for diploid loci are 0, ½ and 1; other values can indicate allelic imbalance and suggest copy number changes or mosaicism.
[Figure: example LRR/BAF plot with BAF bands labeled B-allele Homozygous, Heterozygous and A-allele Homozygous; σLRR = 0.24, σBAF = 0.044]
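The two definitions on this slide can be written out directly. The sketch below is a simplified illustration only: it takes the log ratio as base 2 (the later plots label the axis "log 2 intensity ratio"), uses raw A and B intensities, and treats the tolerance as a hypothetical threshold chosen by the caller. Production genotyping software applies normalization steps that are not shown here.

```python
import math

# Expected BAF values at a diploid locus: AA, AB and BB genotypes.
DIPLOID_BAF = (0.0, 0.5, 1.0)

def lrr(observed_intensity, expected_intensity):
    """Log2 ratio of observed to expected probe intensity; 0 means no deviation."""
    return math.log2(observed_intensity / expected_intensity)

def baf(a_intensity, b_intensity):
    """B allele frequency: B probe intensity as a share of total intensity."""
    return b_intensity / (a_intensity + b_intensity)

def deviates_from_diploid(baf_value, tolerance):
    """True when BAF is farther than `tolerance` from every expected diploid value."""
    return min(abs(baf_value - expected) for expected in DIPLOID_BAF) > tolerance
```

For example, a probe with A intensity 3.0 and B intensity 1.0 has a BAF of 0.25, which sits between the expected homozygous and heterozygous bands and would be flagged at a tolerance of 0.1.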
Definition of Genetic Mosaicism
Co-existence of distinct subpopulations of cells regardless of the clonal or developmental origin
Presence of large structural genomic events (> 2 Mb)
Resulting in alteration of
• Copy number (gain or loss)
• Loss of heterozygosity
Mosaicism in the Extreme
• Age-old explanation for developmental disorders and catastrophic diseases (NF)
  – Trisomy 21, Turners (XO)
• Rare, Highly Penetrant Mutations lead to Variegated Aneuploidy
  – BUB1B
  – CEP57
• Complex Syndromes
  – Proteus Syndrome & AKT1 (NEJM 2011)
  – Ollier Disease & IDH1/IDH2 (Nature Genetics 2011)
  – HRAS: Skin/Cancer (NEJM 2011)
Validation for 42 Events Observed
100%
Rodríguez-Santiago et al. Am J Hum Genet. 2010;87:129-38
1.7% Overall
No Association with Bladder Cancer Risk
Genetic Mosaicism of the Autosomes
• Analysis of 13 GWAS: 57,853
• Cancer cases: 31,717
• Cancer-free controls: 26,136
• Mosaic events detected: 681
• Autosomal chromosomes: 641
• Individuals: 517
• Individuals with multiple events: 69
Jacobs et. al. Nature Genetics 2012
Genetic Mosaic Events
[Figure: log2 intensity ratio (LRR) (y-axis, -0.6 to 0.4) vs. proportion of abnormal cells (p) (x-axis, 0 to 1); event types: gain, neutral LOH, loss]
Age at DNA Collection is the Strongest Predictor of Genetic Mosaicism
[Figure: frequency of mosaicism (0.0% to 3.0%) by age at DNA collection (<45, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79); series: Cancer Free, Cancer DX]
Jacobs et al., Nature Genetics 2012
Higher Frequency in Men Compared to Women

              Females   Males
Cancer Free   0.56%     0.87%
Cancer        0.79%     1.21%
Overall       0.65%     1.04%
Frequency of Mosaic Events by Type & Location

                  Mosaic Chromosome Count            Mosaic Chromosome Frequency (%)
Event Location    gain  loss  cnloh  mixed  Total    gain  loss  cnloh  mixed  Total
chromosome          62    11     42      5    120     9.7   1.7    6.6    0.8   18.7
telomeric P         11    13    114      1    139     1.7   2.0   17.8    0.2   21.7
telomeric Q          9    10    149      0    168     1.4   1.6   23.2    0.0   26.2
interstitial        14   185      2      1    202     2.2  28.9    0.3    0.2   31.5
span centromere      1     1      2      0      4     0.2   0.2    0.3    0.0    0.6
complex              0     3      0      5      8     0.0   0.5    0.0    0.8    1.2
Total               97   223    309     12    641    15.1  34.8   48.2    1.9
Number of chromosomes with mosaic events
• 69 individuals have two or more events
  – 46 cancer cases
  – 23 cancer-free

Count of mosaic   Cancer:            Cancer:            Cancer:   Cancer-   Cancer type(s)
chromosomes       Likely Untreated   Possibly Treated   Total     free
1                 166                111                277       171
2                 15                 13                 28        18
3                 7                  0                  7         3
4                 3                  0                  3         1
5                 0                  2                  2         1
6                 0                  1                  1         0         prostate
7                 1                  0                  1         0         CLL & ovary
11                0                  1                  1         0         gastric
16                1                  0                  1         0         lung
20                0                  2                  2         0         gastric
Circos Plot of mosaic events in 57,853 individuals (681 events)
[Figure: log2 intensity ratio (LRR) (y-axis, -0.6 to 0.4) vs. proportion of abnormal cells (p) (x-axis, 0 to 1); event types: neutral LOH, gain, loss]
Adjusted analysis of association between mosaicism & cancer
Column groups, left to right: All cancer cases; Likely Untreated; Possibly Treated
Site of first cancer OR 95% CI p value OR 95% CI p value OR 95% CI p value
non-hematologic cancer 1.27 (1.05-1.52) 0.012 1.45 (1.18-1.80) 5.4E-04 1.03 (0.81-1.30) 0.804
bladder 1.30 (0.90-1.89) 0.164 1.50 (1.01-2.23) 0.043 0.73 (0.32-1.68) 0.455
breast 0.72 (0.41-1.27) 0.256 0.49 (0.18-1.32) 0.159 0.90 (0.46-1.79) 0.770
endometrium 1.27 (0.64-2.50) 0.494 1.35 (0.42-4.30) 0.611 1.24 (0.54-2.82) 0.610
esophagus 0.86 (0.34-2.18) 0.751 3.51 (0.45-27.58) 0.232 0.76 (0.29-2.03) 0.590
glioma 0.88 (0.45-1.74) 0.717 0.95 (0.44-2.05) 0.892 0.70 (0.17-2.86) 0.622
kidney 1.98 (1.27-3.06) 2.3E-03 2.32 (1.46-3.69) 3.6E-04 0.95 (0.30-3.03) 0.931
lung 1.56 (1.18-2.08) 2.0E-03 1.69 (1.23-2.33) 1.3E-03 1.27 (0.81-1.96) 0.295
osteosarcoma 1.34 (0.39-4.59) 0.637 1.34 (0.39-4.59) 0.637
ovary 1.18 (0.48-2.93) 0.718 1.09 (0.27-4.47) 0.903 1.27 (0.40-4.04) 0.690
pancreas 0.89 (0.60-1.33) 0.574 0.55 (0.14-2.24) 0.406 0.93 (0.62-1.41) 0.735
prostate 1.14 (0.79-1.64) 0.485 1.28 (0.85-1.92) 0.243 0.92 (0.51-1.66) 0.781
stomach 1.43 (0.68-3.03) 0.345 3.35 (0.74-15.13) 0.116 1.32 (0.61-2.88) 0.481
testis 3.29 (0.59-18.46) 0.176 3.29 (0.59-18.46) 0.176
other sites 1.49 (0.55-4.05) 0.438 1.49 (0.55-4.05) 0.438
Hematological Cancers and “Mosaicism”: A Tale of Two Studies
NCI Study (Jacobs et al., 2012)
• 43 hematological cancers
• Higher frequency: 15.8% Myeloid, 26.3% CLL
• Untreated leukemia vs. cancer-free controls: OR = 35.4 (95% CI 14.7-76.6), p = 3.8 × 10⁻¹¹
GENEVA (Laurie et al., 2012)
• Subanalysis in 4 cohorts for heme cancers
• Hazard ratio estimate for mosaic status = 10.1 (95% CI = 5.8 - 17.7), p = 3 × 10⁻¹⁰
Chromosome Y Mosaic Aneuploidy
Pre-Correction Post-Correction
Example of whole chromosome loss in ~60% of cells
• Uncorrected appears to be segmental loss
• Corrected results are clearly whole chromosome loss
SRY RPS4Y1 ZFY
AMELY TBL1Y
PRKY
USP9Y DDX3Y UTY TMSB4Y
NLGN4Y
CYorf15A CYorf15B KDM5D
EIF1AY
qPCR Assay Panel for chrY loss
• 15 probes designed to determine copy number of single-copy genes relative to RNaseP
• Coverage across p- and q-arms
Calling whole chromosome Y-loss from Infinium GWAS data calibrated using qPCR Ratios for 15 Regions
[Figure: scatter plot of uncalibrated Y-loss from Infinium LRR (y-axis, 0.00 to 1.40) vs. Y-loss from 15 qPCR probes (x-axis, 0.00 to 1.40); fitted line y = 0.8599x + 0.132, R² = 0.8121]
Probable Loss
Undetermined
No Loss
Possible Gain
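One way to read the fitted line on this slide is as a calibration curve: if the array-based estimate (y) relates to the qPCR estimate (x) by y = 0.8599x + 0.132, then inverting the line maps an uncalibrated array value onto the qPCR scale. The sketch below assumes that reading; whether the original analysis inverted the fit in exactly this way is not stated, and the function name is hypothetical.

```python
# Slope and intercept of the fitted line reported on the slide,
# where y is the uncalibrated array estimate and x is the qPCR estimate.
SLOPE = 0.8599
INTERCEPT = 0.132

def calibrate_y_loss(uncalibrated_estimate):
    """Map an uncalibrated array-based Y-loss estimate onto the qPCR scale
    by inverting y = SLOPE * x + INTERCEPT (an assumed use of the fit)."""
    return (uncalibrated_estimate - INTERCEPT) / SLOPE
```

Under this reading, an uncalibrated value equal to the intercept (0.132) corresponds to a qPCR-scale value of zero.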
The Aging Genome: Implications for Cancer Studies
• Importance of thorough characterization of ‘germline’ DNA in parallel with somatic analyses
• Insights into Genomic Instability
Early vs. Late Events
• Genetic biomarkers for early detection of hematological cancers
Two Hypotheses for Mosaicism in the Aging Genome
Early Event
Increase in Somatic Alterations with Age
PLUS Decreased Genomic Stability due to Telomere Attrition
LEADS TO Proliferation of Suppressed Populations of Somatically Altered Clones
Decreased Cellular Diversity with Aging and Cell Populations Become Increasingly Oligoclonal
Late Event
Embryonic Progenitors with Somatic Alterations Are Below Threshold of Detection
Unknown Events Trigger Survival Bottleneck
LEADS TO Positive Selection with Rapid Expansion of Second Clonal Population
Co-existence of Distinct Clonal Populations
Using Population Studies to Gain Biological Insights into Genetic Mosaicism:
• Map breakpoint sites (Mitelman/ENCODE)
• Analyze paired tumors in identified studies
• Investigate timing and dynamics (serial samples)
• Proportions & cell type
• Blood Bank Study at NIH
• Australian Twin Registry (2000)
• Confirm Non-hematological Association
• New lung/kidney studies
• Sex Chromosomes
• Y chromosome in TGS
• X has challenge of Lyonization
Detectable Genetic Mosaicism & Hematological Cancers
• PLCO heme cancers (> 700)
• Serial samples for ~25%
• NHL GWAS
• 9000 Cases (5 subtypes)
• Aplastic Anemia (NHLBI- N Young)
• 20% MDS/Leukemia
• CLL/Waldenstrom families (GEB)
• MBL/MGUS serial samples
• CLL Study (L Staudt)
Milestones at the Core Genotyping Facility
2001 2002 2004 2006 2008 2010 2012 2014 & beyond
Candidate SNP
Functional Data
Candidate Genes
Biological Plausibility
Genetic Markers
Candidate Pathway
Biological Plausibility
Genome Wide Association Studies
Regional Sequencing
GWAS & Linkage
Exome Sequencing
Whole Genome Sequencing
Population-based Sequencing
Human Genetics: Thresholds and Significance
• Each new technology has brought us to a new ‘crisis’
Linkage and LOD scores in too few families
Candidate Gene/SNPs in small studies
GWAS in insufficiently powered studies
Required replication/validation
CNVs and power outages together with unstable calling algorithms
Exome sequencing, oligogenic models with insufficient number of families
Whole genome sequencing and the rising tide of ‘uninterpretable’ variants…
Let’s not fool ourselves about germline genetics…
• First it is about……..Discovery
Biology
Targets
Possible risk variants
• Then…………………….Validation
• Followed by…………Characterization
• Later…………………….Clinical application
• Distant future………Targets
Don’t forget lifestyle/environment…..
Mapping Genetic Architecture
• Comprehensive map will emerge across spectrum of variation
• GWAS Common Variants
• Sequencing Uncommon and Rare Variants
• Search for “Pathways”
• Moderate penetrance breast cancer genes
• Develop Risk Profiles in Age of Sequencing
• Can Prevention or Early Detection Improve Outcome?
In the not too distant future
• We will look back and think of GWAS as the “Golden Age”
• The temptation is to continue to do GWAS
– Should not be abandoned (esp with imputation)
– Think bigger in size for comprehensively exploring MAF spaces
• > 10%
• 1-10%
• < 1% Can it be done by association testing??
But the allure of sequencing is at hand…..
Emerging Impact of Population Genetics in the Search for ‘Elusive’ Variants
• In the GWAS era, we tolerated minor mismatching because sample sizes and MAFs were large
• As MAFs fall, challenge of population private variants ‘sky-rockets’
• How do we know if a rare variant in one population is non-contributory whereas in a second population it influences risk/outcome?
• Value of functional validation
• Challenge of Environmental Exposure/Lifestyle
Advances will be accelerated by
“Collective Intelligence”
“I not only use all of the brains
I have, but all I can borrow”
Woodrow Wilson
Acknowledgements
NCI-DCEG Joseph Fraumeni
Peggy Tucker
Gilles Thomas
Robert Hoover
Meredith Yeager
Kevin Jacobs
Sharon Savage
Nilanjan Chatterjee
Nat Rothman
JuHyun Park
Sonja Berndt
Sharon Savage
Lindsay Morton
Zhaoming Wang
NCI-CCR Mike Dean
Hong Lou
Institut Curie Olivier Delattre
Carlo Lucchesi
HSPH David Hunter
Pete Kraft
Sara Lindstrom
BPC3 & CGEMS ACS (M Thun)
ATBC (D Albanes-DCEG)
CAPS (H Gronberg/J Xu)
CeRePP (O Cussenot)
CONOR (L Vatten)
EPIC (E Riboli)
JHU (W Isaacs/J Xu)
MEC (B Henderson)
PLCO (R Hayes)
WHI (R Prentice)
DFCI Matt Freedman
Mark Pomerantz
LTG Renee Chen
Charles Chung
Jean-Nicolas Cornu
Jun Fang
Phoebe Lee
Lea Jessop
Hye Kim
Joe Kovacs
Tim Myers
Nilabja Sikdar
Strategic Support MJ Horner
Tammy Bell
Investigators Laufey Amundadottir
Kevin Brown
Mila Prokunina-Olsson
Acknowledgements 189 authors from 48 participating studies:
Kevin Jacobs
Meredith Yeager
Margaret Tucker
Nathaniel Rothman
Sholom Wacholder
Luis Perez-Jurado
Joseph Fraumeni
African-American Lung Cancer Consortium Krista Zanetti (NCI), Melinda Aldrich, Chris Amos,
William Blot, Cathryn Bock, Elizabeth Gillanders, Curt
Harris, Chris Haiman, Brian Henderson, Laurence
Kolonel, Loic Le Marchand, Lorna McNeill, Benjamin
Rybicki, Ann Schwartz, Lisa Signorello, Margaret
Spitz, John Wiencke, Margaret Wrensch, Xifeng Wu
Lung Neal Caporaso, Teresa Landi, Lynn Goldin, Dario
Consonni, Pier Alberto Bertazzi, Melissa Rotunno
Glioma Preetha Rajaraman (NCI), Laura Beane Freeman (NCI), Christine
Berg (NCI), Julie Buring, Ulrika Andersson, Mary Butler, Tania
Carreon, Maria Feychting, Anders Ahlbom, J Michael Gaziano,
Graham Giles, Goran Hallmans, Wei Zheng, Susan E Hankinson,
Roger Henriksson, Peter D Inskip, Christoffer Johansen
Annelie Landgren, Roberta McKean-Cowdin, Dominique
Michaud, Beatrice Melin, Ulrike Peters, Avima Ruder, Howard
Sesso, Gianluca Severi, Xiao-Ou Shu, Kala Visvanathan, Emily
White, Alicja Wolk, Anne Zeleniuch-Jacquotte,
Wei Zheng, Manolis Kogevinas
Osteosarcoma Sharon Savage, Irene Andrulis,
Jay Wunder, Ana Patiño-Garcia,
Luis Sierrasesumaga, Donald A
Barkauskas, Richard Gorlick
Testis, Ovary &
Endometrium Christian Kratz, Katherine
McGlynn, Mark Greene,
Michael Cook, Barry
Graubard, Ralph Erickson,
Nicolas Wentzensen
Bladder Montse Garcia-Closas, Debra
Silverman, B. Rodriguez-Santiago,
Nuria Malats, Francisco Real, Jonine
Figueroa, Ludmila Prokunina-Olsson,
Dalsu Baris, Gaelle Marenne, Manolis
Kogevinas, Molly Schwenn, Alison
Johnson
PanScan Patricia Hartge, Laufey Amundadottir, Rachael
Stolzenberg-Solomon (NCI), Demetrius Albanes (NCI),
Jarmo Virtamo, Zhaoming Wang, Amy Hutchinson, Alan
A Arslan, H Bas Bueno-de-Mesquita, Charles Fuchs,
Steven Gallinger, Myron D Gross, Elizabeth Holly, Alison
Klein, Andrea LaCroix, Margaret Mandelson, Gloria
Petersen, Marie-Christine Boutron-Ruault, Paige M
Bracci, Federico Canzian, Kenneth Chang, Michelle
Cotterchio, Ed Giovannucci, Michael Goggins, Judith
Hoffman Bolton, Mazda Jenab, Kay-Tee Khaw, Vittorio
Krogh, Robert Kurtz, Robert McWilliams, Julie B
Mendelsohn, Kari Rabe, Elio Riboli, Anne Tjonneland,
Geof Tobias, Dimitrios Trichopoulos, Joanne Elena,
Herbert Yu, Fredrick Shumacher, Daniel Stram, Lisa
Mirabello, Juan R Gonzalez, Olaya Villa, Donghui Li, Eric
J Duell, Harvey A Risch, Sara H Olson, Charles
Kooperberg, Brian M Wolpin, Li Jiao, Manal Hassan,
William Wheeler
Upper GI Christian Abnett, Alisa
Goldstein, Phil Taylor,
Neal Freedman, Linda
Liao, Ti Ding, You-Lin
Qiao, Yu-Tang Gao,
Woon-Puay Koh, Yong-
Bing Xiang, Ze-Zhong
Tang, Jin-Hu Fan, Jian-
Min Yuan
Breast (CGEMS) David Hunter
Peter Kraft
Louise A Brinton,
Jolanta Lissowska,
Beata Peplonska
Regina Ziegler
Prostate Cancer (CGEMS) Robert Hoover, Gilles Thomas,
Sonja Berndt, Weiyin Zhou, Xiang
Deng, Chenwei Liu, Michael Cullen,
Ann Hsing, Caroline Epstein, Laurie
Burdett, Nilanjan Chatterjee, Joshua
Sampson, Amanda Black, Michael
Dean, Charles Chung, Joseph
Kovacs, Nan Hu, Kai Yu, MJ Horner
American Cancer Society Susan Gapstur, Victoria Stevens,
Lauren Teras, Mia Gaudet
Renal Mark Purdue, Wong-
Ho Chow, Lee E
Moore, Kendra
Schwartz, Faith Davis
Investigators:
Cathy C. Laurie, Cecelia A. Laurie, Kenneth Rice, Kimberly F. Doheny, Leila R. Zelnick, Caitlin P.
McHugh, Hua Ling, Kurt N. Hetrick, Elizabeth W. Pugh, Chris Amos, Qingyi Wei, Li-e Wang,
Jeffrey E. Lee, Kathleen C. Barnes, Nadia N. Hansel, Rasika Mathias, Denise Daley, Terri H.
Beaty, Alan F. Scott, Ingo Ruczinski, Rob B. Scharpf, Laura J. Bierut, Sarah M. Hartz, Maria Teresa
Landi, Neal D. Freedman, Lynn R. Goldin, David Ginsburg, Jun Li, Karl C. Desch, Sara S. Strom,
William J. Blot, Lisa B. Signorello, Sue A. Ingles, Stephen J. Chanock, Sonja I. Berndt, Loic Le
Marchand, Brian E. Henderson, Kristine R Monroe, John A. Heit, Mariza de Andrade, Sebastian M.
Armasu, Cynthia Regnier, William L. Lowe, M. Geoffrey Hayes, Mary L. Marazita, Eleanor
Feingold, Jeffrey C. Murray, Mads Melbye, Bjarke Feenstra, Jae Hee Kang, Janey L. Wiggs, Gail
Jarvik, Andrew N. McDavid, Venkatraman E. Seshan, Daniel B. Mirel, Andrew Crenshaw, Nataliya
Sharopova, Anastasia Wise, Jess Shen, David R. Crosslin, David M. Levine, Xiuwen Zheng,
Jenna I Udren, Siiri Bennett, Sarah C. Nelson, Stephanie M. Gogarten, Matthew P. Conomos,
Patrick Heagerty, Teri Manolio, Louis R. Pasquale, Christopher A. Haiman, Neil Caporaso, Bruce
S. Weir
Chromosomal mosaicism from birth to old age and its relationship to
hematological cancer
            Mosaic   Non-mosaic
Event           15           90
No event       134        8,323

Cox proportional hazards regression: stratified analysis of the four cohorts
Time to event ~ Age + non-hematological cancer status (time-dependent) + ethnicity + sex (within the PLCO stratum)
The hazard ratio estimate for mosaic status is 10.1 (95% CI = 5.8 - 17.7) and the p-value is 3 × 10⁻¹⁰
Meta-analysis gave a very similar estimate.
Do mosaic subjects have increased incidence of hematological cancer?
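As a quick plausibility check on the 2×2 counts above, a crude (unadjusted) odds ratio can be computed directly from the table. This is not the stratified Cox model reported on the slide; the log-based (Woolf) confidence interval and the function name are choices made for this sketch.

```python
import math

def crude_odds_ratio(exposed_events, unexposed_events,
                     exposed_no_events, unexposed_no_events, z=1.96):
    """Crude odds ratio with an approximate 95% CI from the log-odds standard error."""
    odds_ratio = (exposed_events * unexposed_no_events) / (
        unexposed_events * exposed_no_events)
    standard_error = math.sqrt(1 / exposed_events + 1 / unexposed_events
                               + 1 / exposed_no_events + 1 / unexposed_no_events)
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper

# Counts from the slide: mosaic subjects with and without a hematological
# cancer event (15, 134) and non-mosaic subjects with and without one (90, 8,323).
odds_ratio, lower, upper = crude_odds_ratio(15, 90, 134, 8323)
print(f"crude OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The crude estimate comes out at about 10.4, in the same range as the adjusted hazard ratio of 10.1, although the two measures are not directly interchangeable.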
CGF Metrics
360,214 DNA profiles (Identifiler)
5.7 × 10⁶ STRs
7.6 × 10¹³ SNP genotypes
150 Gbps regional sequencing
80 whole genomes + 80 on the way
328 samples whole-exome sequencing