Molecular & Genetic Epi 217 Association Studies: Indirect John Witte.
Homework, Question 4: Haplotypes
ID    MTHFR_C677T  MTHFR_A1298C  Haplotypes?
959   CC           AA            C-A / C-A
1044  CC           AC            C-A / C-C
147   CT           AA            C-A / T-A
123   CT           AC            C-A / T-C or C-C / T-A
• Genotypes 677TT and 1298CC are never observed together: this suggests that the T-C haplotype is rare or absent, which makes C-C / T-A the most probable haplotype pair for ID 123, and points to potential selection or chance.
• Rare variants: not necessarily lethal, especially those associated with late-onset diseases.
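The phase ambiguity in the homework table can be checked mechanically. The sketch below is only an illustration, and the function name `haplotype_pairs` is made up for it. It lists every unordered haplotype pair consistent with a set of unphased genotypes, and only the double heterozygote (ID 123) yields more than one pair.

```python
from itertools import product

def haplotype_pairs(genotypes):
    """Return the unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of two-letter genotype strings, one per SNP, e.g. ["CT", "AC"].
    """
    pairs = set()
    # For each SNP, choose which allele sits on the first chromosome;
    # the other allele then sits on the second chromosome.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "-".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "-".join(g[1 - c] for g, c in zip(genotypes, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

# Genotypes (MTHFR_C677T, MTHFR_A1298C) taken from the homework table.
for sample_id, g677, g1298 in [(959, "CC", "AA"), (1044, "CC", "AC"),
                               (147, "CT", "AA"), (123, "CT", "AC")]:
    print(sample_id, haplotype_pairs([g677, g1298]))
```

Choosing between the two pairs for ID 123 needs outside information, such as haplotype frequencies in the population.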
3 SNPs in the TAS2R38 Gene
Possible haplotypes (amino acids at the 3 SNPs): PAV, AVI, PAI, AAV, PVI, PVV, AAI, AVV
TASR: 3 SNPs form Haplotypes
PAV: Taster
AVI: Non-taster
TAS2R38 Haplotype Function
[Figure: response curves, one per haplotype. x-axis: PTC concentration (M), log scale from 0.1 to 1000; y-axis: Ratio PTC / SST, from 0 to 1.2; curves: PAV, PAI, PVV, PVI, AAV, AAI, AVV, AVI]
TASR Genotyping Results
ID  Taster  rs10246939  rs1726866  rs713598  Haplotypes  Amino Acid
10  0       CT          AG         CG        CGG*/TAC    PAV/AVI
12  1       CT          AG         CG        CGG*/TAC    PAV/AVI
14  1       .           .          .         .           .
17  0       CC          GG         GG        CGG/CGG     PAV/PAV
19  1       CT          AG         CG        CGG*/TAC    PAV/AVI
20  1       CT          AG         CG        CGG*/TAC    PAV/AVI
22  .       TT          AA         CC        TAC/TAC     AVI/AVI
24  1       CC          GG         GG        CGG/CGG     PAV/PAV
26  .       CT          AG         CG        CGG*/TAC    PAV/AVI
28  1       CT          AG         CG        CGG*/TAC    PAV/AVI
29  1       CC          GG         CG        CGG/CGC     PAV/PAI
30  0       TT          AA         CC        TAC/TAC     AVI/AVI
31  1       CC          GG         GG        CGG/CGG     PAV/PAV
Too many MTHFR SNPs
Solution: Tag SNP Selection
SNPs are correlated (aka Linkage Disequilibrium)
Carlson et al. (2004) AJHG 74:106
[Figure: six SNPs (SNP 1 A/T, SNP 2 G/A, SNP 3 G/C, SNP 4 T/C, SNP 5 G/C, SNP 6 A/C), with three sets of SNPs in high r2]
Pairwise Tagging: SNP 1, SNP 3, SNP 6 (3 tags in total)
Test for association: SNP 1, SNP 3, SNP 6
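A greedy selection is one simple way to pick pairwise tags. The sketch below is an assumption-laden illustration, not the procedure used by any particular tool. The function name, the 0.8 threshold default and the r2 values are all hypothetical, since the pairwise r2 values for the six SNPs are not listed here.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Pick tag SNPs greedily until every SNP is captured at r2 >= threshold.

    snps: list of SNP identifiers.
    r2:   function r2(a, b) returning the pairwise r2 (with r2(a, a) == 1).
    """
    untagged = set(snps)
    tags = []
    while untagged:
        # SNPs each candidate would capture among those not yet tagged.
        capture = {s: {t for t in untagged if r2(s, t) >= threshold} for s in snps}
        # Largest capture set wins; ties go to the earliest SNP in `snps`.
        best = max(snps, key=lambda s: len(capture[s]))
        tags.append(best)
        untagged -= capture[best]
    return tags

# Hypothetical r2 values: three pairs of SNPs in perfect LD, everything else 0.
HIGH_R2 = {(1, 2): 1.0, (3, 5): 1.0, (4, 6): 1.0}

def example_r2(a, b):
    if a == b:
        return 1.0
    return HIGH_R2.get((min(a, b), max(a, b)), 0.0)

tags = greedy_pairwise_tags([1, 2, 3, 4, 5, 6], example_r2)
print(len(tags), "tags:", tags)
```

With these made-up values three tags are enough. Which member of each correlated pair gets picked is arbitrary and depends only on the tie-breaking rule.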
Coverage: Measurement Error in TagSNPs
Common Measures of Coverage
• Threshold measures: e.g., 73% of SNPs in the complete set are in LD with at least one SNP in the genotyping set at r2 > 0.8
• Average measures: e.g., average maximum r2 = 0.84
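Both coverage measures are simple summaries of one list: for each SNP in the complete set, the maximum r2 with any SNP in the genotyping set. A minimal sketch follows; the function name and the input values are hypothetical.

```python
def coverage_measures(max_r2, threshold=0.8):
    """Return (threshold coverage, average maximum r2).

    max_r2: for each SNP in the complete set, its highest r2 with any
            SNP in the genotyping set.
    """
    n = len(max_r2)
    # Fraction of SNPs captured above the threshold.
    threshold_coverage = sum(1 for value in max_r2 if value > threshold) / n
    # Mean of the per-SNP maximum r2 values.
    average_max_r2 = sum(max_r2) / n
    return threshold_coverage, average_max_r2

# Hypothetical per-SNP maximum r2 values.
fraction, average = coverage_measures([1.0, 1.0, 0.5, 0.5])
print(f"threshold coverage = {fraction:.0%}, average maximum r2 = {average:.2f}")
```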
Coverage and Sample Size
• Sample size required for Direct Association: n
• Sample size for Indirect Association: n* = n / r2
• For r2 = 0.8, increase is 25%
• For r2 = 0.5, increase is 100%
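The relation n* = n / r2 is easy to check numerically. In the sketch below the function name and the direct sample size of 1000 are hypothetical choices for illustration.

```python
def indirect_sample_size(n_direct, r2):
    """Sample size needed when testing a tag SNP instead of the causal SNP."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return n_direct / r2

n = 1000  # hypothetical sample size for direct association
for r2 in (0.8, 0.5):
    n_star = indirect_sample_size(n, r2)
    increase = (n_star / n - 1) * 100
    print(f"r2 = {r2}: n* = {n_star:.0f} (increase of {increase:.0f}%)")
```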
Tag SNPs Database Resources
http://www.hapmap.org
http://gvs.gs.washington.edu/GVS/index.jsp
HapMap
• Re-sequencing to discover millions of additional SNPs; deposited to dbSNP.
• SNPs from dbSNP were genotyped
• Looked for 1 SNP every 5kb
• SNP Validation
  – Polymorphic
  – Frequency
• Haplotype and Linkage Disequilibrium Estimation
  – LD tagging SNPs
HapMap Phase III Populations
• ASW African ancestry in Southwest USA
• CEU Utah residents with Northern and Western European ancestry from the CEPH collection
• CHB Han Chinese in Beijing, China
• CHD Chinese in Metropolitan Denver, Colorado
• GIH Gujarati Indians in Houston, Texas
• JPT Japanese in Tokyo, Japan
• LWK Luhya in Webuye, Kenya
• MEX Mexican ancestry in Los Angeles, California
• MKK Maasai in Kinyawa, Kenya
• TSI Toscani in Italia
• YRI Yoruba in Ibadan, Nigeria
Tag SNPs: HapMap
Tag SNPs: HapMap & Haploview
http://www.broad.mit.edu/mpg/haploview/
Tag SNPs: HapMap Summary
Identified 33 common MTHFR SNPs (MAF > 5%) among Caucasians
Forced in 3 potentially functional/previously associated SNPs
Identified tag SNPs based on pairwise tagging
15 tag SNPs could capture all 33 MTHFR SNPs (mean r2 = 97%)
Note: number of SNPs required varies from gene to gene and from population to population
1K Genomes Project
Genome-wide Association Studies (GWAS)
One- and Two-Stage GWA Designs
[Figure: genotype matrices of Samples (1, 2, 3, ..., N) by SNPs (1, 2, 3, ..., M). One-Stage Design: a single samples-by-SNPs matrix. Two-Stage Design: the samples-by-markers matrix is split into Stage 1 and Stage 2, shown for replication-based analysis and for joint analysis.]
Multistage Designs
• Joint analysis has more power than replication
• p-value threshold in Stage 1 must be liberal
• Lower cost, but no gain in power
• http://www.sph.umich.edu/csg/abecasis/CaTS/index.html
Complex diseases
[Diagram of interrelated factors: Diabetes, Obesity, Diet, Physical activity, Hypertension, Hyperlipidemia, Vulnerable plaques, Atherosclerosis, MI, Genetic susceptibility]
Complex diseases: Many causes = many causal pathways!
Pathways
• Many websites / companies provide ‘dynamic’ graphic models of molecular and biochemical pathways.
• Example: BioCarta: http://www.biocarta.com/
• May be interested in potential joint and/or interaction effects of multiple genes in one pathway.
Interactions
• “The interdependent operation of two or more causes to produce or prevent an effect”
• “Differences in the effects of one or more factors according to the level of the remaining factor(s)”
• Last, 2001
      AA       Aa       aa
BB    At risk  At risk  No risk
Bb    At risk  At risk  No risk
bb    No risk  No risk  No risk
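Read as a rule, the table says a person is at risk only when carrying at least one A allele and at least one B allele. The sketch below encodes that reading; the function name `at_risk` is invented for the illustration, and the rule is inferred from the table rather than stated in it.

```python
def at_risk(genotype_a, genotype_b):
    """At risk only with at least one A allele AND at least one B allele."""
    return "A" in genotype_a and "B" in genotype_b

# Rebuild the 3 x 3 table from the rule.
for gb in ("BB", "Bb", "bb"):
    row = ["At risk" if at_risk(ga, gb) else "No risk" for ga in ("AA", "Aa", "aa")]
    print(gb, row)
```

Under this rule neither locus shows any effect among people who lack the risk allele at the other locus, so each gene's effect depends entirely on the other.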
Why look for interactions?
• Improve detection of genetic (& environmental) risks.
• Understand etiology/biology
• New hypotheses?
• Diagnostics
• Prevention and interventions
Dilution of effects
[Figure: Gene A has an overall OR = 1.5; within subgroups defined by Drinker?, Micronutrient X, Environmental exposure Y, and Other gene Z, the values shown are 5.2, 2.1, 0.1, 2.8, 2.7, 0.6, 19, 0.1, 25, 21, 0.2, 0.1, 16]
Within particular subgroups, effect of gene may be quite high or low
Statistical vs. Biological Interactions
• Not identical.
• One hypothesizes biological interaction
• But ‘tests’ for statistical interaction
• Does statistical evidence support our biological hypothesis?
Multiplicative vs. Additive Interactions
Additive “effect”
      g     G
e     1.0   1.4
E     2.0   2.4

Multiplicative “effect” (ORs, RRs)
      g     G
e     1.0   1.4
E     2.0   2.8
(2.8/2.0) / (1.4/1.0) = 1.0

Multiplicative interaction (ORs, RRs)
      g     G
e     1.0   1.4
E     2.0   7.8
(7.8/2.0) / (1.4/1.0) = 2.8

Departure from 1 is a multiplicative interaction
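The ratio used for multiplicative interaction can be recomputed from the 2 x 2 tables of ORs. This is a minimal sketch and the function and argument names are hypothetical. The argument names follow the table labels, with lower case for unexposed or non-carrier and upper case for exposed or carrier.

```python
def multiplicative_interaction(or_eG, or_Eg, or_EG, or_eg=1.0):
    """Ratio of the G effect among the exposed (E) to the G effect among the unexposed (e)."""
    return (or_EG / or_Eg) / (or_eG / or_eg)

# Joint ORs of 2.4, 2.8 and 7.8 from the three tables; the other cells are shared.
for joint_or in (2.4, 2.8, 7.8):
    ratio = multiplicative_interaction(or_eG=1.4, or_Eg=2.0, or_EG=joint_or)
    print(f"OR(E,G) = {joint_or}: ratio = {ratio:.2f}")
```

A joint OR of 2.8 gives a ratio of 1, meaning no multiplicative interaction. A joint OR of 2.4 gives a ratio below 1, so a table with no additive interaction can still depart from 1 on the multiplicative scale.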
Additive “effect”
RER = (OR(E,G)-1) / ((OR(E,g)-1) + (OR(e,G)-1))
    = (2.4-1) / ((2.0-1) + (1.4-1)) = 1.0
RER = relative excess risk
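The RER formula can be applied to each of the three joint ORs to see how the additive scale classifies them. This is a minimal sketch of the formula exactly as written above; the function name is hypothetical.

```python
def relative_excess_risk(or_EG, or_Eg, or_eG):
    """RER = (OR(E,G) - 1) / ((OR(E,g) - 1) + (OR(e,G) - 1))."""
    return (or_EG - 1) / ((or_Eg - 1) + (or_eG - 1))

# Joint ORs of 2.4, 2.8 and 7.8 with OR(E,g) = 2.0 and OR(e,G) = 1.4.
for joint_or in (2.4, 2.8, 7.8):
    rer = relative_excess_risk(or_EG=joint_or, or_Eg=2.0, or_eG=1.4)
    print(f"OR(E,G) = {joint_or}: RER = {rer:.2f}")
```

Only the joint OR of 2.4 gives an RER of 1. The joint OR of 2.8 departs from additivity even though it shows no multiplicative interaction, so the two scales can disagree.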
Brennan, P. Carcinogenesis 2002 23:381-387
Two possible causal pathways: additive and multiplicative interaction for colorectal cancer
Additive interaction: G1 and E5: independent risk factors
Multiplicative interaction: G2 and E2: work through same pathway
If factors are not known to act independently, use multiplicative.
Analysis of Multiple Genes
• Joint / Additive
• Multiplicative
• Increasing complexity
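Main effects only:
\ln\left(\frac{P(D)}{1 - P(D)}\right) = \alpha + \beta_1 G_1 + \beta_2 G_2 + \beta_3 G_3

With pairwise interaction terms:
\ln\left(\frac{P(D)}{1 - P(D)}\right) = \alpha + \beta_1 G_1 + \beta_2 G_2 + \beta_3 G_3 + \beta_{1*2} G_1 G_2 + \beta_{2*3} G_2 G_3 + \beta_{1*3} G_1 G_3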
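To show how the logistic models for multiple genes turn genotypes into a disease probability, the sketch below evaluates P(D) from a linear predictor with optional pairwise product terms. Everything in it is assumed for illustration: the function name, the 0/1 genotype coding and the coefficient values.

```python
import math

def disease_probability(genotypes, alpha, main, interaction=None):
    """P(D) from ln(P(D) / (1 - P(D))) = alpha + sum(b_i * G_i) + sum(b_ij * G_i * G_j).

    genotypes:   tuple of genotype codes (G1, G2, G3), e.g. 0/1 for non-carrier/carrier.
    main:        main-effect coefficients, one per gene.
    interaction: optional dict mapping index pairs (i, j) to product-term coefficients.
    """
    eta = alpha + sum(b * g for b, g in zip(main, genotypes))
    for (i, j), b in (interaction or {}).items():
        eta += b * genotypes[i] * genotypes[j]
    return 1 / (1 + math.exp(-eta))  # inverse logit

# Hypothetical coefficients on the log-odds scale.
alpha = math.log(0.1)                      # baseline odds of 0.1
main = [math.log(1.4), math.log(2.0), 0.0]
pairwise = {(0, 1): math.log(2.0)}         # product term for G1 * G2

carrier = (1, 1, 0)
print(disease_probability(carrier, alpha, main))
print(disease_probability(carrier, alpha, main, pairwise))
```

With the product-term coefficient set to 0 the two models coincide. A nonzero value changes the joint odds beyond the product of the main-effect ORs.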
More Complex Modeling
• Multifactor-dimensionality reduction (Moore & Williams, Ann Med 2002)
• Logic regression (Kooperberg & Ruczinski, Genetic Epi 2005)
• Multi-loci analysis (Marchini, Donnelly, Cardon, Nat Genet 2005)
• Bayesian epistasis association mapping (Zhang & Liu, Nat Genet 2007)