Steven Gazal
Alkes Price lab - Harvard School of Public Health / Broad Institute
Genomics of common diseases - 09/25/2016
The linkage disequilibrium dependent architecture of human complex traits
-
2
What does “architecture” mean?
• How heritable is the trait (h²)?
• How many causal variants are there?
• How are causal variants distributed across:
  - common vs. rare variants (hg² vs. h²)
  - MAF spectrum: MAF-dependent architecture
  - functional annotations (DHS, H3K27ac, conserved, etc.) …
-
3
What does “LD-dependent architecture” mean?
• LD-dependent architecture: dependence of causal effect sizes on the level of LD of a SNP after conditioning on MAF
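One common way to quantify the "level of LD of a SNP" is its LD score, the sum of squared correlations between that SNP and the SNPs around it. The sketch below is only an illustration under simplifying assumptions: the function name `ld_scores` is hypothetical, all SNPs in the matrix are treated as one window, no sample-size bias correction is applied, and every SNP is assumed to be polymorphic.

```python
import numpy as np

def ld_scores(genotypes, annotation=None):
    """Naive LD scores from an (individuals x SNPs) genotype matrix.

    Assumptions (not from the slides): every SNP is polymorphic, all SNPs
    belong to a single window, and r^2 is used without bias correction.
    """
    G = np.asarray(genotypes, dtype=float)
    n_individuals = G.shape[0]
    # Standardize each SNP to mean 0 and variance 1.
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    # Pairwise correlations between SNPs, then squared.
    r2 = (Z.T @ Z / n_individuals) ** 2
    if annotation is None:
        # Unweighted LD score: sum of r^2 over all SNPs (including itself).
        annotation = np.ones(G.shape[1])
    # Annotation-weighted LD score: sum_k annotation[k] * r2[j, k].
    return r2 @ np.asarray(annotation, dtype=float)
```

Passing an annotation vector weights each neighbouring SNP by its annotation value, which is how a per-annotation LD score can be formed for binary or continuous annotations.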
-
Evidence of LD-dependent architecture
SNPs in regions with low levels of LD have larger effect sizes
• DNase I hypersensitive site (DHS) regions (Gusev et al. 2014 AJHG)
• Functional elements such as histone marks (Finucane et al. 2015 Nat Genet)
• Regions with high GC-content (Loh et al. 2015 Nat Genet)
More disease variants in coding regions with lower recombination rate (where a high level of LD is expected) (Hussin et al. 2015 Nat Genet)
This matters because an LD-dependent architecture biases heritability estimates (Speed et al. 2012 AJHG, Gusev et al. 2013 PLoS Genet, Yang et al. 2015 Nat Genet)
These discordant findings have never been formally assessed, quantified or biologically interpreted
-
7
Outline: The LD-dependent architecture of human complex traits
1. Assessing the LD-dependent architecture of human complex traits
2. Understanding which processes shaping LD patterns are involved in human complex traits
-
9
Stratified LD score regression (Finucane et al. 2015): extension to continuous-valued annotations
Let’s consider Q continuous annotations
[Figure: example annotations, binary (Coding, DHS; values 0 or 1) and continuous (Recomb Rate, Level of LD (LLD))]
Per-standardized causal effect size
βj : genotype effect size of SNP j
aj,q : annotation of SNP j in category Qq
τq : effect size of category Qq
N : number of individuals
l(j,q) : LD score of SNP j in category Qq
a : measure of confounding
r²jk : squared correlation between SNPs j and k
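The model equations shown on this slide did not survive in the transcript. For reference, using the symbols defined above, stratified LD score regression with continuous annotations is usually written as follows. This is a reconstruction from the published method (Finucane et al. 2015), not a copy of the slide, so the exact layout used in the talk is an assumption.

```latex
\begin{align*}
\mathrm{Var}(\beta_j) &= \sum_{q=1}^{Q} a_{j,q}\,\tau_q \\
\ell(j,q) &= \sum_{k} a_{k,q}\, r_{jk}^{2} \\
\mathbb{E}\!\left[\chi_j^{2}\right] &= N \sum_{q=1}^{Q} \tau_q\,\ell(j,q) + N a + 1
\end{align*}
```

With binary annotations, a_{j,q} is 0 or 1 and the first line reduces to a sum over the categories that contain SNP j; with continuous annotations the same expression holds with real-valued a_{j,q}.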
-
10
Measuring the level of LD (LLD) of a SNP
• Need to take into account the MAF-dependent architecture
  – Common SNPs have larger causal effect sizes
  – Common SNPs have larger LD scores
• Level of LD (LLD) measure: LD score corrected for MAF through MAF-stratified quantile normalization
• We started from the following model
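The MAF-stratified quantile normalization behind the LLD measure can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the use of equal-count MAF bins, the default of 10 bins, and the mapping to a standard normal distribution are all assumptions made for the example.

```python
import numpy as np
from statistics import NormalDist

def maf_stratified_quantile_normalize(values, maf, n_bins=10):
    """Quantile-normalize `values` (e.g. LD scores) to N(0, 1) within MAF bins.

    Assumptions (hypothetical choices): equal-count MAF bins and a
    standard normal target distribution.
    """
    values = np.asarray(values, dtype=float)
    maf = np.asarray(maf, dtype=float)
    out = np.empty_like(values)
    # Bin edges chosen so that each MAF bin holds roughly the same number of SNPs.
    edges = np.quantile(maf, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, maf, side="right") - 1, 0, n_bins - 1)
    inv_cdf = NormalDist().inv_cdf
    for b in range(n_bins):
        idx = np.flatnonzero(bins == b)
        if idx.size == 0:
            continue
        # Rank SNPs within the bin (1..n), convert ranks to quantiles in (0, 1),
        # then map quantiles to standard normal values.
        ranks = np.argsort(np.argsort(values[idx], kind="stable"), kind="stable") + 1
        quantiles = (ranks - 0.5) / idx.size
        out[idx] = [inv_cdf(q) for q in quantiles]
    return out
```

Because every MAF bin is mapped to the same distribution, the normalized value carries information about LD relative to other SNPs of similar frequency, which is the sense in which the LD score is "MAF corrected".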
-
11
56 traits analyzed (summary statistics only)
• Publicly available summary statistics (29)
• 23andMe, Inc. summary statistics (19)
• UK Biobank data (15)
56 traits in total (average N = 101,420)
Meta-analysis over 31 independent traits (average N = 84,719)
-
12
SNPs with low LLD explain more heritability
[Figure: Impact of level of LD (LLD) on genetic architecture for 20 highly heritable traits. y-axis: LLD effect size (τ*), from −0.8 to 0.0. Traits: Age at Menarche, Anorexia, Autism Spectrum, Blood Pressure (Diastolic), BMI, Celiac Disease, Crohn's Disease, Cup Size, Hair Curl, Heel T-Score, Height, IQB Gray Amount, Lung FEV1/FVC ratio, Lung FVC, Male Pattern, Primary Biliary Cirrhosis, Rheumatoid Arthritis, Schizophrenia, Shoe Size, Unibrow]
Meta-analysis: P = 1.88 × 10^-94
Negative effects across the 56 traits investigated!
-
13
Outline: The LD-dependent architecture of human complex traits
1. Assessing the LD-dependent architecture of human complex traits
2. Understanding which processes shaping LD patterns are involved in human complex traits
-
14
Many annotations correlated to LD could contribute to LD-dependent architectures
Many processes correlate with LLD
• Biological mechanisms and background selection
  – Predicted allele age (ARGweaver, Rasmussen et al. 2014 PLoS Genet)
  – LLD in Africans (LLD-AFR)
  – Recombination rates (HapMap 2 recombination map)
  – Nucleotide diversity (AC ≥ 5)
  – Background selection statistic (McVicker et al. 2009 PLoS Genet)
  – CpG-Content
• Baseline model of Finucane et al. (2015 Nat Genet)
  – 58 annotations
  – Coding regions, enhancers, promoters, histone marks, conserved regions …
Annotations corrected via MAF-stratified quantile normalization
-
15
Many annotations correlated to LD could contribute to LD-dependent architectures
• Functional annotations correlate with low LD
• Recent mutations have low LLD!
[Figure panels: Functional annotations from the baseline model (Finucane et al. 2015 Nat Genet); LD-related annotations]
-
17
Many LD-related annotations impact causal effect sizes
[Figure: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6, for LLD, LLD-D', LLD-REG, Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic and CpG-Content. Model: Annotation + MAF. Meta-analysis of 31 independent traits. Figure annotation: r = −0.49]
• Recombination rate has discordant sign of effect (Hill & Robertson 1966 Genet Res)
Heritability is enriched in SNPs with low LLD in low recombination rate regions
-
19
Many LD-related annotations impact causal effect sizes after conditioning on the baseline model
[Figure: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6, for LLD, LLD-D', LLD-REG, Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic and CpG-Content. Models compared: Annotation + MAF; Annotation + MAF + baseline model. Meta-analysis of 31 independent traits. Figure annotation: 0.37x]
• LLD effect is 0.37x smaller when including annotations from baseline model
Some, but not all, of LD-dependent architecture due to DHS, enhancers, etc.
-
20
Many LD-related annotations impact causal effect sizes after conditioning on the baseline model
[Figure: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6, for LLD, LLD-D', LLD-REG, Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic and CpG-Content. Models compared: Annotation + MAF; Annotation + MAF + baseline model. Meta-analysis of 31 independent traits]
Predicted allele age has the largest effect.
SNPs with smaller (more recent) allele age have larger causal effect sizes
Negative effects across 55/56 traits
-
22
Many LD-related annotations impact causal effect sizes in joint fit after conditioning on baseline model
[Figure: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6, for LLD, LLD-D', LLD-REG, Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic and CpG-Content. Models compared: Annotation + MAF; Annotation + MAF + baseline model; Annotations jointly + MAF + baseline model. Meta-analysis of 31 independent traits]
6 significant annotations in joint fit
-
23
Quintiles illustrate large effects of LD-related annotations
[Figure: proportion of heritability (0% to 40%) explained by each quintile, from 1st quintile (lowest values) to 5th quintile (highest values), of Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic, CpG-Content and MAF]
-
24
Predicted allele age is even more informative than MAF
Youngest 20% of SNPs explain 3.8x more heritability than the oldest 20%
[Figure: proportion of heritability (0% to 40%) explained by each quintile, from 1st quintile (lowest values) to 5th quintile (highest values), of Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic, CpG-Content and MAF. Figure annotations: 3.8x, 1.8x]
-
25
No overall effect for recombination rate
Competing effects of LLD-AFR and RR imply no overall effect for RR
[Figure: proportion of heritability (0% to 40%) explained by each quintile, from 1st quintile (lowest values) to 5th quintile (highest values), of Predicted Allele Age, LLD-AFR, Recombination Rate, Nucleotide Diversity, Background Selection Statistic, CpG-Content and MAF]
-
26
LD-related annotations tag background selection
• Predicted allele age: deleterious variants are younger
• LLD-AFR: information on variant history?
• Recombination rate: low recombination rate => less efficient selection (Hill & Robertson 1966 Genet Res)
• Nucleotide diversity: regions under selection have lower diversity
• Background selection statistic: regions under selection are biologically important
• CpG-Content: mutation rate? unknown functional elements?
[Figure: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6, for the LD-related annotations. Model: Annotations jointly + MAF + baseline model. 31 traits]
-
27
Forward simulations confirm impact of many LD-related annotations on selection coefficient s
• Forward simulations using SLiM (Messer 2013 Genetics) under African-European demographic model (Gravel et al. 2011 PNAS)
• Recombination rate and % of deleterious SNPs vary across regions; selection coefficient s varies across deleterious SNPs
• Jointly regress selection coefficient s on 4 LD-related annotations and minor allele frequency
[Figure, real data: LD-related annotation effect size (τ*), y-axis from −0.8 to 0.6; model: Annotations jointly + MAF + baseline model; 31 traits. Figure, simulations: forward simulations, impact on s; standardized regression coefficient, y-axis from −8 to 6, for True Allele Age, LLD-AFR, Recombination Rate and Nucleotide Diversity]
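The joint regression of the selection coefficient s on the LD-related annotations and MAF can be sketched as an ordinary least-squares fit on standardized variables. This is only an illustration: the function name is hypothetical, and the slides do not say which estimator or standardization was used, so plain OLS with z-scored inputs is an assumption.

```python
import numpy as np

def standardized_joint_regression(s, predictors):
    """Jointly regress s on several predictors after z-scoring everything.

    `predictors` is an (n_snps x n_predictors) array, e.g. columns for
    allele age, LLD-AFR, recombination rate, nucleotide diversity and MAF.
    Returns one standardized coefficient per predictor column.
    """
    s = np.asarray(s, dtype=float)
    X = np.asarray(predictors, dtype=float)
    # z-score the response and each predictor so coefficients are comparable.
    s_z = (s - s.mean()) / s.std()
    X_z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Add an intercept column and solve the least-squares problem.
    design = np.column_stack([np.ones(len(s_z)), X_z])
    coef, *_ = np.linalg.lstsq(design, s_z, rcond=None)
    return coef[1:]  # drop the intercept
```

Fitting all predictors in one model, rather than one at a time, is what lets each coefficient be read as the effect of that annotation conditional on the others and on MAF.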
-
28
Conclusions
1. SNPs with lower LD have larger causal effect sizes (after conditioning on MAF) across all traits analyzed.
2. About half of this effect can be explained by known functional annotations (e.g. DHS, enhancers) with lower LD.
3. The remainder of the effect is explained by multiple LD-related annotations, including predicted allele age.
4. Forward simulations confirm that all of these findings are consistent with the action of negative selection.
-
29
Acknowledgments
Harvard School of Public Health:
• Alkes Price
• Hilary Finucane
• Po-Ru Loh
• Pier Palamara
• Xuanyao Liu
• Armin Schoech
• Sasha Gusev
23andMe, Inc.:
• Nick Furlotte
• Research participants of 23andMe
MGH/Broad Institute:
• Brendan Bulik-Sullivan
• Ben Neale
UK Biobank
-
31
56 traits analyzed (summary statistics only)
• Publicly available summary statistics (29): Age at Menarche, Age at Menopause, Anorexia, Autism, Bipolar, BMI, Celiac, Coronary Artery Disease, Crohn's Disease, Depressive Symptoms, Ever Smoked, Fasting Glucose, HbA1C, HDL, Height, IBD, LDL, Neuroticism, Cirrhosis, Putamen Volume, Rheumatoid Arthritis, Schizophrenia, Subjective well-being, Lupus, Triglycerides, Type 2 Diabetes, Ulcerative Colitis, Years of Education
• 23andMe summary statistics (19): Age at Menarche, Age voice deepened, Black hair, Chin dimple, Cup Size, Dimples, Facial stubble, Female hair loss, Hair color, Hair curl, Height, Hypermobility, IQB gray amount, Male pattern, Motion sick, Nose size, Shoe size, Unibrow, Widows peak
• UK Biobank data (15): Age at Menarche, Age at Menopause, Asthma, Blood pressure (Diastolic), Blood pressure (Systolic), BMI, College Education, Eczema, Heel T-Score, Height, Hypertension, Lung FEV1/FVC ratio, Lung Forced Expiratory Volume, Smoking Status, Waist–hip ratio (BMI adjusted)
56 traits in total (average N = 101,420)
Meta-analysis over 31 independent traits (average N = 84,719)
-
32
SNPs with smaller (more recent) allele age have larger causal effect sizes
[Figure: Impact of predicted allele age on genetic architecture for 20 highly heritable traits. y-axis: allele age effect size (τ*), from −1.0 to 0.0. Traits: Age at Menarche, Anorexia, Autism Spectrum, Blood Pressure (Diastolic), BMI, Celiac Disease, Crohn's Disease, Cup Size, Hair Curl, Heel T-Score, Height, IQB Gray Amount, Lung FEV1/FVC ratio, Lung FVC, Male Pattern, Primary Biliary Cirrhosis, Rheumatoid Arthritis, Schizophrenia, Shoe Size, Unibrow]
Meta-analysis: P = 1.60 × 10^-106
Negative effects across 55/56 traits
-
33
Future directions
• Can we infer different distributions of selection coefficients s across different classes?
• Do different traits have significantly different values of selection coefficient s?