Post on 12-Jan-2016
Cambridge, July 16, 2010
Tomi Pastinen, MD, PhD
Assistant Professor
Departments of Human and Medical Genetics, McGill University
McGill University and Genome Quebec Innovation Centre
1. Allelic expression: principle and methodology
2. Catalogs of cis-regulatory SNPs (cis-rSNPs)
3. Applications
Asthma
Type 1 Diabetes
[Figure: non-coding variant and coding variant, alleles C/T]
mRNA (or pre-mRNA)
relative allele ratios can be used as a quantitative trait for mapping local cis-regulatory variation in phased samples
Expected: equal expression of allelic transcripts (C:T = 1:1)
Observed: biased allelic expression (C:T = 1:2)
Allelic Expression (AE) Measurement
Illumina Human 1M Duo (currently 2.5M quad)
gDNA AE ratio (T/C) = 1
cDNA AE ratio (T/C) = 2
Population panel of cells (CEU & YRI LCLs from HapMap)
AE mapping in phased chromosomes
[Figure: phased genotypes AB, AB, AA]
AE association +ve: cis-regulatory SNP (cis-rSNP)
AE association -ve
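The mapping idea can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: it assumes each sample has an AE value oriented as chromosome 1 versus chromosome 2, and a phased genotype at a candidate SNP coded +1 (AB), -1 (BA), or 0 (AA/BB). The function name `ae_association`, the coding, and the toy numbers are all hypothetical.

```python
from math import sqrt

def ae_association(phase_codes, ae_values):
    """Return (Pearson r, regression slope) of AE values on phased genotype codes.

    phase_codes: +1 (AB), -1 (BA), 0 (AA or BB) per sample (assumed coding).
    ae_values:   AE value per sample, oriented chromosome 1 vs chromosome 2.
    """
    n = len(phase_codes)
    if n != len(ae_values) or n < 3:
        raise ValueError("need matched inputs with at least 3 samples")
    mean_x = sum(phase_codes) / n
    mean_y = sum(ae_values) / n
    sxx = sum((x - mean_x) ** 2 for x in phase_codes)
    syy = sum((y - mean_y) ** 2 for y in ae_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(phase_codes, ae_values))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype codes or AE values")
    return sxy / sqrt(sxx * syy), sxy / sxx

# Toy data: heterozygotes show AE that flips sign with phase,
# homozygotes show roughly equal expression.
codes = [1, 1, -1, -1, 0, 0]
ae = [0.20, 0.18, -0.22, -0.19, 0.01, -0.02]
r, slope = ae_association(codes, ae)
```

A strong positive `r` here corresponds to a positive AE association at the candidate SNP; a value near zero corresponds to a negative one. Significance testing is left out of this sketch.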
Variability of allele ratio [ = Y/(Y+X)] needs to be accounted for when comparing cDNA to gDNA
[Plot labels: cDNA, gDNA]
R2 = 0.72 in biological replicates (CEU LCL)
R2 = 0.83 in technical replicates (CEU LCL)
R2 = 0.76 in biological replicates in the same environment (osteoblast)
R2 = 0.61 within an individual but in different cell culture conditions (osteoblast)
We use the heterozygote ratio difference between phased cDNA and gDNA genotype data:
het ratio = RNA c1/(c1+c2) - DNA c1/(c1+c2)
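As a worked illustration of the formula, the sketch below computes the het ratio for one heterozygous site. It assumes c1 and c2 are the allele signals assigned to the two phased chromosomes; the function names and the numbers are hypothetical.

```python
def allele_ratio(c1, c2):
    """Fraction of total signal coming from the chromosome-1 allele."""
    total = c1 + c2
    if total <= 0:
        raise ValueError("allele signals must sum to a positive value")
    return c1 / total

def het_ratio(rna_c1, rna_c2, dna_c1, dna_c2):
    """RNA allele ratio minus DNA allele ratio at one heterozygous site."""
    return allele_ratio(rna_c1, rna_c2) - allele_ratio(dna_c1, dna_c2)

# Balanced expression: RNA 0.50 vs DNA 0.48 gives a value close to zero.
balanced = het_ratio(500, 500, 480, 520)

# Two-fold allelic bias in RNA against a balanced DNA signal.
biased = het_ratio(800, 400, 500, 500)

# Only the chromosome-1 allele is expressed.
monoallelic = het_ratio(1000, 0, 500, 500)
```

Subtracting the gDNA ratio corrects for assay-specific allele bias, so with a balanced gDNA signal the value ranges from -0.5 to 0.5.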
• single-point allelic expression is noisy
• heterozygosity is low when using coding SNPs only
P = 2 x 10^-9
[Plot axis: C1C2 genotype (BA, BB/AA, AB)]
phased RNA (cDNA) and genomic DNA (gDNA) genotyping data from the same individual are averaged across multiple sites in primary transcripts
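The averaging step can be sketched as follows. This is a minimal illustration under stated assumptions: per-site values are already oriented to the same phased chromosome, the mean is unweighted, and a minimum of two informative sites is required. None of these details is specified in the slides, and `transcript_ae` is a hypothetical name.

```python
from statistics import mean

def transcript_ae(site_het_ratios, min_sites=2):
    """Average per-site het ratios across one primary transcript.

    Sites without a usable measurement are passed as None and skipped.
    Returns None when fewer than min_sites informative sites remain
    (min_sites=2 is an illustrative choice, not a documented threshold).
    """
    informative = [r for r in site_het_ratios if r is not None]
    if len(informative) < min_sites:
        return None
    return mean(informative)

# Four informative heterozygous sites and one site with no usable signal.
sites = [0.12, 0.18, None, 0.15, 0.09]
avg = transcript_ae(sites)

# A single site is treated as too noisy to report.
single = transcript_ae([0.10])
```

Pooling sites across introns and exons of the primary transcript addresses both problems listed for single-point measurements: noise and low heterozygosity.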
Full Transcript (AFF3 = ~600Kb), 5’ Association
AE measurements across large genes
[Figure: alleles A and B; scale of observations]
Observation: AE ratio
Monoallelic A: 0.5
Equal expression of A and B: 0
Monoallelic B: -0.5
Differential 5’ Exon Usage (CUGBP2), 5’ Association
Allele-specific expression of long isoform
• on average, >50% of population variance in cis-regulation can be explained by common SNPs in associated loci
• 5-10x more functional variation revealed as compared to cis-eQTL mapping
• >90% of mapped cis-rSNPs behave as expected in the offspring (Mendelian inheritance)
• large effect sizes are observed for associated variants
• common cis-variants affect >30% of measured RefSeq transcripts
• low-throughput methods show converging data for 75% of genome-wide significant AE mapping results, but a diversity of mechanisms is suggested
AE association by RefSeq annotation

Threshold                         RefSeq (n)   Fraction of measured RefSeq   Non-RefSeq overlap (n)   Fraction of measured non-RefSeq   Mean r2
GW significant (P < 7.6 x 10^-9)  1360         14%                           815                      8%                                0.74
Permutation 0.001                 2935         30%                           2225                     21%                               0.63
Permutation 0.005                 3408         35%                           2787                     26%                               0.59
Ge et al., Nature Genet. 2009
1) Large effect size (> 1.2-fold difference between cis-rSNP heterozygotes) across full-length transcripts
2) Most (>75%) of the available SNPs in the primary transcript are above the signal cut-off
3) Consistent allelic effects across introns and exons of the primary transcript (for transcripts fulfilling criteria 1+2, the proportion with exon-intron r2 > 0.3 is >90%)
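The three criteria can be expressed as a simple filter. The sketch below is illustrative only: the thresholds (1.2-fold, 75%, r2 > 0.3) come from the slide, while the record layout, field names, and example transcripts are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TranscriptAE:
    fold_difference: float   # allelic fold difference between cis-rSNP heterozygotes
    snps_above_cutoff: int   # SNPs in the primary transcript above the signal cut-off
    snps_available: int      # all available SNPs in the primary transcript
    exon_intron_r2: float    # agreement between exonic and intronic AE

def passes_criteria_1_and_2(t, min_fold=1.2, min_fraction=0.75):
    """Criterion 1 (effect size) and criterion 2 (SNP coverage)."""
    if t.snps_available == 0:
        return False
    fraction = t.snps_above_cutoff / t.snps_available
    return t.fold_difference > min_fold and fraction > min_fraction

def is_consistent(t, min_r2=0.3):
    """Criterion 3: consistent allelic effects across introns and exons."""
    return t.exon_intron_r2 > min_r2

transcripts = {
    "tx_a": TranscriptAE(1.5, 18, 20, 0.62),
    "tx_b": TranscriptAE(1.1, 19, 20, 0.70),  # fails criterion 1
    "tx_c": TranscriptAE(1.4, 10, 20, 0.55),  # fails criterion 2
    "tx_d": TranscriptAE(1.3, 16, 20, 0.12),  # passes 1+2 but is inconsistent
}
selected = [name for name, t in transcripts.items() if passes_criteria_1_and_2(t)]
consistent = [name for name in selected if is_consistent(transcripts[name])]
```

Keeping criterion 3 as a separate check mirrors the slide, which reports it as a proportion among transcripts that already fulfil criteria 1 and 2.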
~17% of genes
For top cis-eQTLs, up to 50% of AE-mapping data show a converging cis-rSNP; but given the high discovery rate, only ~10% of cis-rSNPs yield a significant cis-eQTL (Ge et al. 2009)
But comparison of AE mapping data in YRI LCLs vs. YRI RNA-seq data shows converging effects for the vast majority of transcriptional cis-rSNPs
[Figure: association plots, -log10(P-value), for CEU, YRI, and CEU+YRI; labeled values 6.8, 6.3, and 11]
Fine-map region of shared association to look for causal cis-rSNPs
[Figure: CEU SNP and YRI SNP curves plotted against 1000G score cutoff (-5 to 0)]
simple scoring based on deviation from expected heterozygosity among samples showing unequal/equal AE
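The slide does not give the scoring formula, so the sketch below is only one plausible reading of the idea, not the score used in the talk: a causal cis-rSNP is expected to be heterozygous in samples with unequal AE and homozygous in samples with equal AE. The function name, the genotype coding, and the difference-of-fractions score are all assumptions.

```python
def heterozygosity_score(genotypes, unequal_ae):
    """Het fraction among unequal-AE samples minus het fraction among equal-AE samples.

    genotypes:  alternate-allele counts per sample (0, 1, or 2); 1 = heterozygous.
    unequal_ae: True where the sample shows unequal allelic expression.
    Hypothetical score for illustration only; values near 1 fit a causal site.
    """
    het_unequal = [g == 1 for g, u in zip(genotypes, unequal_ae) if u]
    het_equal = [g == 1 for g, u in zip(genotypes, unequal_ae) if not u]
    if not het_unequal or not het_equal:
        raise ValueError("need at least one sample in each AE group")
    return sum(het_unequal) / len(het_unequal) - sum(het_equal) / len(het_equal)

ae_status = [True, True, True, False, False, False]

# Heterozygous exactly where AE is unequal.
candidate = heterozygosity_score([1, 1, 1, 0, 2, 0], ae_status)

# Heterozygosity unrelated to AE status.
unrelated = heterozygosity_score([1, 0, 2, 1, 1, 0], ae_status)
```

Ranking every sequenced site in the associated region by such a score is what allows fine-mapping beyond the genotyped SNPs.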
Pperm < 0.001 in FB
Pperm > 0.001 in FB
Overlapping transcription-altering cis-rSNP
5’ proximal cis-rSNPs altering regulation of DISC1 in a cell type-independent manner
Most common type of cis-rSNPs
5’ distal cis-rSNPs altering regulation of PTGER4 in a cell type-dependent manner
3rd most common type of cis-rSNPs
3’ distal cis-rSNPs altering regulation of EFNA5 in a cell type-dependent manner
Least common type of cis-rSNPs
5’ distal cis-rSNPs altering regulation of EFNA5 in a cell type-dependent manner
P-value (2-tailed)

Association orientation                5'          3'          5'          3'
Maximum fine-mapped SNPs/transcript    1           1           3           3
BroadChipSeqPeaksGm12878Ctcf           9.33E-05    2.11E-04    3.16E-06    9.21E-08
BroadChipSeqPeaksGm12878H3k4me1        1.01E-43    9.57E-25    9.59E-53    2.32E-24
BroadChipSeqPeaksGm12878H3k4me2        2.14E-21    1.91E-09    6.79E-22    7.88E-15
BroadChipSeqPeaksGm12878H3k4me3        3.53E-16    1.14E-06    6.24E-09    2.12E-10
DukeDNaseSeqPeaksGm12878V3             2.50E-04    1.96E-03    4.19E-12    2.86E-04
UncFAIREseqPeaksGm12878V3              1.30E-79    3.16E-46    4.87E-112   6.88E-70
UtaChIPseqPeaksGm12878CtcfV3           9.03E-07    1.98E-01    1.47E-03    9.17E-03
UwChIPSeqHotspotsRep1Gm06990Ctcf       5.33E-09    1.13E-07    8.33E-06    1.30E-10
UwChIPSeqHotspotsRep1Gm12801Ctcf       2.85E-02    1.81E-04    8.36E-04    5.60E-08
UwChIPSeqPeaksGm12801Ctcf              3.09E-04    1.27E-02    1.62E-07    6.59E-06
UwDnaseSeqHotspotsRep1Gm12878          4.29E-09    4.40E-05    1.91E-05    1.10E-06
UwDnaseSeqHotspotsRep2Gm06990          2.93E-05    2.14E-05    9.56E-04    1.69E-06
UwDnaseSeqHotspotsRep2Gm12878          9.63E-08    2.22E-02    6.98E-04    1.02E-03
UwDnaseSeqPeaksRep2Gm12878             8.58E-09    3.92E-01    4.27E-08    8.22E-02
YaleChIPseqPeaksGm12878MaxV2           3.41E-06    2.21E-01    3.13E-03    4.09E-01
YaleChIPseqPeaksGm12878Pol2V2          8.60E-09    2.41E-02    7.71E-05    1.72E-01
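The first step behind a table like this is counting how many fine-mapped SNPs fall inside the peaks of each track. The sketch below shows only that intersection step, under stated assumptions: a single chromosome, non-overlapping peaks given as half-open [start, end) intervals, and hypothetical function names and coordinates. The statistical test that produced the p-values is not described in the slides and is not reproduced here.

```python
from bisect import bisect_right

def build_index(peaks):
    """Sort (start, end) peaks and keep the start positions for binary search."""
    ordered = sorted(peaks)
    starts = [start for start, _ in ordered]
    return starts, ordered

def in_peak(position, index):
    """True if position lies in a peak; assumes non-overlapping [start, end) peaks."""
    starts, ordered = index
    i = bisect_right(starts, position) - 1  # last peak starting at or before position
    return i >= 0 and position < ordered[i][1]

def overlap_fraction(positions, peaks):
    """Fraction of SNP positions that fall inside any peak."""
    if not positions:
        raise ValueError("no SNP positions given")
    index = build_index(peaks)
    hits = sum(in_peak(p, index) for p in positions)
    return hits / len(positions)

peaks = [(500, 650), (100, 200)]          # toy peak intervals
snp_positions = [150, 300, 500, 649, 650]  # toy fine-mapped SNP positions
fraction = overlap_fraction(snp_positions, peaks)
```

Comparing this fraction between fine-mapped cis-rSNPs and a background set of SNPs is the kind of contrast an enrichment p-value summarizes.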
In vitro validation of intronic enhancer rSNP (rs909685)
In vitro validation of promoter rSNP (rs344071)
Allele-specific DNA-protein interactions
[Figure: rs17658686 (C/G); tracks: Input, FAIRE, MNase. Gel lanes: C* - competitor; C* - nuclear extraction; C* + C competitor; C* + G competitor; C* + nonspecific competitor; G* - competitor; G* + C competitor; G* + G competitor; G* + nonspecific competitor]
Genetic association
Functional association
Ge et al. Nature Genetics 2009
Functional association
Potential mechanism
Verlaan et al. AJHG, 2009
Creutzfeldt-Jakob disease: PRNP
LDL cholesterol: HMGCR
CRP levels: IL6R
Crohn’s disease: IL23R
Plasma homocysteine: CBS
Tooth development: HOXB2
CIS-rSNPs POTENTIALLY EXPLAINING DISEASE ASSOCIATIONS ARE ENRICHED FOR TISSUE-SPECIFIC VARIANTS
Preliminary observations:
Examples:
common haplotypes harbor functional alleles altering cis-regulation in most human genes
cis-regulatory SNPs altering transcription can be characterized by:
• specific assessment of population variation in cis-regulation (AE-mapping)
• fine-mapping using sequenced genomes (1000G/imputation for common variants)
• intersection with functional genomic data (ENCODE)
regulatory variation in complex genomic regions (overlapping transcripts), or variation causing post-transcriptional effects, requires other tools (strand-specific assays/RNA-seq)
large-scale, orthogonal validation tools need to catch up with mapping
McGill University and Génome Québec Innovation Centre
Pastinen Lab: Tony Kwan, Véronique Adoue, Lisanne Morcos, Dominique J Verlaan, Tomi Pastinen, Elin Grundberg, Vonda Koka, Kevin Lam, Bing Ge
Alexandre Montpetit, Eef Harmsen, Joana Dias, Rose Hoberman, Ken Dewar
RLBP1L1 Potential new transcript
1) top associated SNP from AE-mapping
2) highest scoring 1000 Genomes site
Region of active chromatin
Histone marks are tissue-specific
Region for RNA polymerase 2 binding
Highly conserved regions of regulatory potential
But the most comprehensive survey of imprinting to date suggests that <100 imprinted loci exist, as compared to thousands of loci modulated by cis-rSNPs
Morcos et al. manuscript in prep.