SEM with Measured Genotypes

SEM with Measured Genotypes NIDA Workshop VIPBG, October 2012 Maes, H. H., Neale, M. C., Chen, X., Chen, J., Prescott, C. A., & Kendler, K. S. (2011). A twin association study of nicotine dependence with markers in the CHRNA3 and CHRNA5 genes. Behav Genet, 41(5), 680-690. doi: 10.1007/s10519-011-9476-z


Transcript of SEM with Measured Genotypes

Page 1: SEM with Measured Genotypes

SEM with Measured Genotypes
• NIDA Workshop
• VIPBG, October 2012

• Maes, H. H., Neale, M. C., Chen, X., Chen, J., Prescott, C. A., & Kendler, K. S. (2011). A twin association study of nicotine dependence with markers in the CHRNA3 and CHRNA5 genes. Behav Genet, 41(5), 680-690.

• doi: 10.1007/s10519-011-9476-z

Page 2: SEM with Measured Genotypes

What would ACE look like if we knew the genes and environments?

Page 3: SEM with Measured Genotypes

Where we’d like to be

Page 4: SEM with Measured Genotypes

Molecular Studies using Relatives
• Association studies
  • Often based on genotyping unrelated individuals
  • Statistical models for relatives have been extended to include measured genotypes
    • van den Oord et al. (2002), Merlin (Abecasis, 2002)
• Genotype data added to twin/family studies
  • Increased power from family design
• Problem:
  • Some relatives have phenotypes but no genotypes
  • Power of association studies can be increased if incompletely genotyped families are retained in analyses
    • (Visscher et al. 2008)

Page 5: SEM with Measured Genotypes

The Tobacco and Genetics Consortium Nature Genetics (2010)

Page 6: SEM with Measured Genotypes

Meta-Analyses of GWAS of Smoking
• Three meta-analyses
  • TAG 2010; Thorgeirsson et al. 2010; Liu et al. 2010
  • > 100,000 individuals
• Several genome-wide significant results
  • Initiation: BDNF; Cessation: DBH
  • Consumption
    • Neuronal acetylcholine receptor subunit genes
    • SNPs in CHRNA5 and CHRNA3
    • rs16969968
      • First identified by Saccone et al. (2007)

Page 7: SEM with Measured Genotypes

Goal: test whether nicotine dependence is linked to nicotinic receptor variants

• printACE(AceFit)
        a^2 c^2  e^2 aS^2
   [1,] 0.62   0 0.38    0

• printACE(AcegFit)
        a^2 c^2  e^2 aS^2
   [1,] 0.61   0 0.38 0.01

• mxCompare(AcegFit, AceFit)
     base comparison ep minus2LL  df      AIC   diffLL diffdf      p
   1 ACEg       <NA> 15 3386.251 730 1926.251       NA     NA     NA
   2 ACEg    ACEonly 14 3390.466 731 1928.466 4.214693      1 0.0401
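The comparison above is a likelihood-ratio test: the difference in -2 log-likelihood between the nested models is referred to a chi-square distribution with degrees of freedom equal to the difference in model df. A minimal base-R sketch that recomputes the reported statistics from the numbers printed on the slide (the variable names are illustrative, not from the original script):

```r
# Values copied from the mxCompare output on the slide
m2ll_aceg <- 3386.251  # ACE model plus SNP effect (15 estimated parameters)
m2ll_ace  <- 3390.466  # ACE model only (14 estimated parameters)
df_aceg   <- 730
df_ace    <- 731

# Likelihood-ratio test for the single SNP effect
diffLL  <- m2ll_ace - m2ll_aceg
diffdf  <- df_ace - df_aceg
p_value <- pchisq(diffLL, df = diffdf, lower.tail = FALSE)

# AIC as printed in this output: -2LL minus twice the model df
aic_aceg <- m2ll_aceg - 2 * df_aceg
aic_ace  <- m2ll_ace  - 2 * df_ace

print(round(c(diffLL = diffLL, diffdf = diffdf, p = p_value), 4))
```

The recomputed p-value agrees with the 0.0401 shown by mxCompare, and the model with the SNP effect has the lower AIC.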

Page 8: SEM with Measured Genotypes

Twin Association Model
• Traditional Twin Model
• Measured genotypes as covariates in the means model
• Quantify contributions of specific variants as well as background genetic and environmental factors

Page 9: SEM with Measured Genotypes

Twin Association Model

Page 10: SEM with Measured Genotypes

Expected Means based on allelic effects of SNPs

• Population mean = “m”
• Allele at a particular locus = either “A” or “a”
• SNP effect modeled as deviations from “m”
• Additive (aS) and dominance (dS) SNP effects
• Expected mean for
  • AA homozygote = m + aS
  • Aa heterozygote = m + dS
  • aa homozygote = m - aS
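The three expected means can be written as a small lookup. This is a sketch in base R; the function name and the example values of m and aS are hypothetical and chosen only for illustration:

```r
# Expected phenotype mean for each genotype, as deviations from the
# population mean m: +aS for AA, dS for Aa, -aS for aa.
expected_mean <- function(genotype, m, aS, dS = 0) {
  effects <- c(AA = aS, Aa = dS, aa = -aS)
  stopifnot(all(genotype %in% names(effects)))
  unname(m + effects[genotype])
}

# Purely additive example (dS fixed at 0), with made-up values
example_means <- expected_mean(c("AA", "Aa", "aa"), m = 5, aS = 0.4)
print(example_means)
```

With dS fixed at 0 the heterozygote mean sits exactly midway between the two homozygote means, which is the additive model.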

Page 11: SEM with Measured Genotypes

MZs: 1 of 3 classes

[Figure: the three MZ genotype classes, with mean deviations +aS (AA), dS (Aa) and -aS (aa)]

Page 12: SEM with Measured Genotypes

DZs: 1 of 3x3 classes

Page 13: SEM with Measured Genotypes

Twin Data Availability by Zygosity
(G = genotyped, P = phenotyped, GP = both, - = no data)

                                        MZ                DZ
Combination  Genotyped  Phenotyped   twin 1  twin 2   twin 1  twin 2
 1           both       both         GP      GP       GP      GP
 2           both       twin 1       GP      G        GP      G
 3           both       twin 2       G       GP       G       GP
 4           both       neither      G       G        G       G
 5           one        both         GP      P        GP      P
 6           one        twin 1       GP      -        GP      -
 7           one        twin 2       G       P        G       P
 8           one        neither      G       -        G       -
 9           neither    both         P       P        P       P
10           neither    twin 1       P       -        P       -
11           neither    twin 2       -       P        -       P
12           neither    neither      -       -        -       -

Page 14: SEM with Measured Genotypes

Missing Genotypes

One MZ twin genotyped; one or both twins phenotyped

> Assign the genotyped twin's genotype to the un-genotyped co-twin

One DZ twin genotyped; one or both twins phenotyped

> Use allele frequencies to assign a probability of membership in each of the 3 possible classes, given the genotyped twin's genotype

Neither MZ twin genotyped; one or both twins phenotyped

> Assign probability of membership in each of the 3 possible genotype classes based on allele frequencies

Neither DZ twin genotyped; one or both twins phenotyped

> Assign probability of membership in each of the 9 possible genotype classes based on allele frequencies

Page 15: SEM with Measured Genotypes

So, how do we substitute values for our missing genotype data?
• Need to know the expected proportions of each genotype when twin 1 and/or twin 2 is missing, for MZ and for DZ pairs
• Need to code our data in such a way as to allow us to fill in the 3 possible values for missing MZ data and the 9 possible values when one or both DZ twin genotypes in a pair are missing.

Page 16: SEM with Measured Genotypes

Expected proportion of each genotype based on allele frequencies

Genotype    Expected Proportion                    Expected Mean
T1   T2     MZ     DZ                              T1        T2
AA   AA     p^2    p^4 + p^3q + (pq)^2/4           gm + aS   gm + aS
AA   Aa     0      p^3q + (pq)^2/2                 gm + aS   gm + dS
AA   aa     0      (pq)^2/4                        gm + aS   gm - aS
Aa   AA     0      p^3q + (pq)^2/2                 gm + dS   gm + aS
Aa   Aa     2pq    p^3q + 3(pq)^2 + pq^3           gm + dS   gm + dS
Aa   aa     0      pq^3 + (pq)^2/2                 gm + dS   gm - aS
aa   AA     0      (pq)^2/4                        gm - aS   gm + aS
aa   Aa     0      pq^3 + (pq)^2/2                 gm - aS   gm + dS
aa   aa     q^2    q^4 + pq^3 + (pq)^2/4           gm - aS   gm - aS

The expected proportion of each genotypic category of twin pairs is calculated from the allele frequencies (p, and q = 1 - p) obtained from the total sample of genotyped individuals.
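The expected proportions can be generated from the allele frequency and checked numerically. The sketch below is base R with hypothetical function names. It uses the standard sib-pair probabilities under Hardy-Weinberg equilibrium, in which the homozygote/heterozygote cells carry (pq)^2/2; with that value the nine DZ proportions sum to 1 and each twin's marginal genotype frequencies are p^2, 2pq and q^2.

```r
# Order of the nine pair classes: twin 1 genotype varies slowest
pair_labels <- c("AAAA", "AAAa", "AAaa",
                 "AaAA", "AaAa", "Aaaa",
                 "aaAA", "aaAa", "aaaa")

# MZ twins always share a genotype, so only three cells are non-zero
mz_pair_prob <- function(p) {
  q <- 1 - p
  setNames(c(p^2, 0, 0, 0, 2 * p * q, 0, 0, 0, q^2), pair_labels)
}

# DZ twins (full siblings), assuming Hardy-Weinberg proportions
dz_pair_prob <- function(p) {
  q <- 1 - p
  setNames(c(
    p^4 + p^3 * q + (p * q)^2 / 4,      # AA AA
    p^3 * q + (p * q)^2 / 2,            # AA Aa
    (p * q)^2 / 4,                      # AA aa
    p^3 * q + (p * q)^2 / 2,            # Aa AA
    p^3 * q + 3 * (p * q)^2 + p * q^3,  # Aa Aa
    p * q^3 + (p * q)^2 / 2,            # Aa aa
    (p * q)^2 / 4,                      # aa AA
    p * q^3 + (p * q)^2 / 2,            # aa Aa
    q^4 + p * q^3 + (p * q)^2 / 4       # aa aa
  ), pair_labels)
}

p <- 0.35                        # illustrative allele frequency
print(round(dz_pair_prob(p), 4))
print(sum(dz_pair_prob(p)))      # should be 1 up to rounding error
```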

Page 17: SEM with Measured Genotypes

Let’s look at the dataset…
• str(selData)

  'data.frame': 850 obs. of 9 variables:
   $ zyg    : int 1 3 5 3 4 2 1 4 5 3 ...
   $ rs10a11: int 1 1 1 1 0 1 1 1 1 0 ...
   $ rs10a12: int 1 1 1 1 1 1 1 1 1 0 ...
   $ rs10a13: int 1 1 1 1 0 1 1 1 1 1 ...
   $ rs10a21: int 1 1 1 1 1 1 1 1 1 0 ...
   $ rs10a22: int 1 1 1 1 1 1 1 1 1 0 ...
   $ rs10a23: int 1 1 1 1 1 1 1 1 1 1 ...
   $ ftnd1  : int NA 5 5 7 6 NA NA NA 4 NA ...
   $ ftnd2  : int NA NA 5 9 NA NA NA 8 NA NA ...

Page 18: SEM with Measured Genotypes

Recode Genotypes into 3 columns (to map into the 9 genotype classes)

                          rs#11  rs#12  rs#13
• if rs10a1 = 2  [AA]       1      0      0
• if rs10a1 = 1  [Aa]       0      1      0
• if rs10a1 = 0  [aa]       0      0      1
• if rs10a1 = NA [??]       1      1      1

• mzGen1 = c(rs#11, rs#12, rs#13)
• Now we can multiply these 1 x 3 matrices to get a 9-cell vector with 1s in the “possible” twin-pair genotype classes
• vector[ t(Gen1) %*% Gen2 ]
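The recoding and the 9-cell product can be sketched in base R as follows. The function names are hypothetical, and the cell order (twin 1 genotype varying slowest: AAAA, AAAa, AAaa, AaAA, ...) is an assumption chosen to match the order of the nine pair classes used elsewhere in the slides. The slide's outer product t(Gen1) %*% Gen2 contains the same nine values, but flattening a matrix in R is column-major, so the ordering needs care.

```r
# Recode a 0/1/2 genotype (copies of allele A) into three indicators
# (AA, Aa, aa). A missing genotype is compatible with all three classes.
recode_genotype <- function(g) {
  if (is.na(g)) return(c(1, 1, 1))
  stopifnot(g %in% 0:2)
  as.numeric(c(g == 2, g == 1, g == 0))
}

# 9-cell indicator of the pair classes compatible with the observed data.
# Twin 1 varies slowest, twin 2 fastest.
pair_classes <- function(g1, g2) {
  a <- recode_genotype(g1)
  b <- recode_genotype(g2)
  rep(a, each = 3) * rep(b, times = 3)
}

print(pair_classes(2, NA))   # twin 1 is AA, twin 2 not genotyped
print(pair_classes(NA, NA))  # neither twin genotyped
print(pair_classes(1, 0))    # twin 1 is Aa, twin 2 is aa
```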

Page 19: SEM with Measured Genotypes

vector[t(Gen1)%*%Gen2]

Page 20: SEM with Measured Genotypes

Individual Proportions
• mzGen1 | mzGen2   >   mzGenProb
     mzN x 6             mzN x 9
• # mzN = number of MZ pairs
• mzGenComb = vector(t(mzGen1) %*% mzGen2)
• mzGenProb = mzGenComb %*% mzProb / (mzGenComb %*% (mzProb %*% U))
  # note: “%*% U” sums all the probabilities (Einstein addition)
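One way to read the formula above is: keep the population probability of every pair class that is compatible with the observed genotypes, zero out the rest, and rescale so the weights for the pair sum to 1. The base-R sketch below implements that reading for a single pair. It is an interpretation of the slide's pseudocode, not the original script, and the function and variable names are hypothetical.

```r
# comb: 9-cell 0/1 vector of pair classes compatible with the data
# prob: 9-cell vector of expected pair-class proportions (MZ or DZ)
genotype_weights <- function(comb, prob) {
  stopifnot(length(comb) == 9, length(prob) == 9)
  w <- comb * prob           # elementwise: drop incompatible classes
  stopifnot(sum(w) > 0)      # data must be compatible with some class
  w / sum(w)                 # renormalise so the weights sum to 1
}

p <- 0.35                    # illustrative allele frequency
q <- 1 - p
mz_prob <- c(p^2, 0, 0, 0, 2 * p * q, 0, 0, 0, q^2)

both_missing <- rep(1, 9)                      # neither MZ twin genotyped
t1_is_AA     <- c(1, 1, 1, 0, 0, 0, 0, 0, 0)   # twin 1 AA, twin 2 missing

w_missing <- genotype_weights(both_missing, mz_prob)
w_known   <- genotype_weights(t1_is_AA, mz_prob)
print(round(w_missing, 4))
print(w_known)
```

For MZ pairs with one genotyped twin, all the weight lands on the matching diagonal cell, which reproduces the rule of assigning the co-twin's genotype to the un-genotyped twin.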

Page 21: SEM with Measured Genotypes

Matrices for Genotype

# Matrices to store effect of genotype

mxMatrix(name="mean"  , type="Full", nrow=nv, ncol=nv, free=T, values=0, label="gm"),
mxMatrix(name="addSNP", type="Full", nrow=nv, ncol=nv, free=T, values=0, label="aS"),
mxMatrix(name="domSNP", type="Full", nrow=nv, ncol=nv, free=F, values=0, label="dS"),
mxMatrix(name="pSNP"  , type="Full", nrow=nv, ncol=nv, free=F, values=allelep),
mxAlgebra(1-pSNP, name="qSNP"),
mxAlgebra(2 * pSNP * qSNP * addSNP^2, name="S"),
mxAlgebra(V+S, name="totalV"),
mxAlgebra((cbind(A,C,E,S)) %x% solve(totalV), name="stVarCom"),
mxAlgebra(cbind(V, A, C, E, S, stVarCom), name="allVarCom"),
mxMatrix(name="U9", type="Unit", nrow=9, ncol=1),
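For a single phenotype (nv = 1) the algebras S, totalV and stVarCom reduce to scalar arithmetic. A base-R sketch, assuming V = A + C + E as in the usual ACE parameterisation; the function names and input values are made up for illustration and are not estimates from the study:

```r
# Variance attributable to an additive SNP effect aS at allele frequency p
snp_variance <- function(p, aS) {
  q <- 1 - p
  2 * p * q * aS^2
}

# Standardised components: each variance divided by the total (V + S)
standardized_components <- function(A, C, E, S) {
  total <- A + C + E + S
  c(a2 = A, c2 = C, e2 = E, aS2 = S) / total
}

S_example <- snp_variance(p = 0.35, aS = 0.3)   # hypothetical values
st <- standardized_components(A = 2.4, C = 0, E = 1.5, S = S_example)
print(round(st, 3))
```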

Page 22: SEM with Measured Genotypes

Expected Mean Vector

# Matrix & Algebra for expected means vector and expected thresholds
mxAlgebra(rbind(mean+addSNP, mean+domSNP, mean-addSNP), name="mean3"),
mxAlgebra(cbind(mean+addSNP, mean+addSNP), name="expMean_AAAA"),
mxAlgebra(cbind(mean+addSNP, mean+domSNP), name="expMean_AAAa"),
mxAlgebra(cbind(mean+addSNP, mean-addSNP), name="expMean_AAaa"),
mxAlgebra(cbind(mean+domSNP, mean+addSNP), name="expMean_AaAA"),
mxAlgebra(cbind(mean+domSNP, mean+domSNP), name="expMean_AaAa"),
mxAlgebra(cbind(mean+domSNP, mean-addSNP), name="expMean_Aaaa"),
mxAlgebra(cbind(mean-addSNP, mean+addSNP), name="expMean_aaAA"),
mxAlgebra(cbind(mean-addSNP, mean+domSNP), name="expMean_aaAa"),
mxAlgebra(cbind(mean-addSNP, mean-addSNP), name="expMean_aaaa"),

Page 23: SEM with Measured Genotypes

Expected Thresholds, Covariances

# Matrices & Algebra for expected thresholds (first two fixed, rest free increments)
mxMatrix(type="Full", nrow=nth, ncol=nv, free=c(F,F,rep(T,nth-2)), values=thValues, lbound=thLBound, name="Thre"),
mxMatrix(type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="Inc"),
mxAlgebra(Inc %*% Thre, name="ThreInc"),
mxAlgebra(cbind(ThreInc,ThreInc), dimnames=list(thRows,selVars), name="expThre"),
# Algebra for expected variance/covariance matrices
mxAlgebra(rbind(cbind(A+C+E, A+C),
                cbind(A+C  , A+C+E)), name="expCovMZ"),
mxAlgebra(rbind(cbind(A+C+E    , 0.5%x%A+C),
                cbind(0.5%x%A+C, A+C+E)), name="expCovDZ")
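Premultiplying by a lower-triangular matrix of ones turns a first threshold plus positive increments into an ordered set of thresholds, which is what Inc %*% Thre does. A base-R sketch with made-up values (nth = 3 and the numbers in Thre are illustrative only):

```r
nth <- 3

# Lower-triangular matrix of ones, like the "Inc" matrix on the slide
Inc <- lower.tri(matrix(1, nth, nth), diag = TRUE) * 1

# First element is the first threshold; later elements are increments
Thre <- matrix(c(0, 1, 0.5), nrow = nth, ncol = 1)

# Cumulative sums give the thresholds themselves
ThreInc <- Inc %*% Thre
print(ThreInc)
```

Keeping the increments above zero (the role of the lower bounds on the threshold matrix) guarantees that the thresholds stay in increasing order during optimisation.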

Page 24: SEM with Measured Genotypes

mxModel("MZ_",
  mxData(mzData, type="raw"),
  mxModel("MZ_AA", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_AAAA", selVars, "ACE.expThre", vector=T)),
  mxModel("MZ_Aa", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_AaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("MZ_aa", mxFIMLObjective("ACE.expCovMZ", "ACE.expMean_aaaa", selVars, "ACE.expThre", vector=T)),
  mxMatrix("Full", mzN, 9, F, values=mzGenProb, name="mzWeights"),
  mxMatrix("Zero", mzN, 1, name="Zero"),
  # Weighted mixture over genotype classes; only the 3 concordant cells apply to MZ pairs
  mxAlgebra(-2 * sum(log((mzWeights * cbind(
      MZ_AA.objective, Zero           , Zero,
      Zero           , MZ_Aa.objective, Zero,
      Zero           , Zero           , MZ_aa.objective)) %*% ACE.U9)), name="MZmix"),
  mxAlgebraObjective("MZmix")
)
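The MZmix algebra is a finite-mixture likelihood: each pair's likelihood under every genotype class is multiplied by that pair's class weight, summed across classes, logged, and summed over pairs. A base-R sketch of the same arithmetic on a toy 2 x 3 example; the likelihood values are invented, and in the real model they come from the vector=T objectives of the submodels:

```r
# weights: pairs x classes matrix of class-membership probabilities
# lik:     pairs x classes matrix of per-pair likelihoods under each class
mixture_m2ll <- function(weights, lik) {
  stopifnot(all(dim(weights) == dim(lik)))
  U <- matrix(1, nrow = ncol(weights), ncol = 1)  # unit vector sums over classes
  -2 * sum(log((weights * lik) %*% U))
}

# Pair 1 has a known genotype class; pair 2 is un-genotyped
weights <- rbind(c(1.00, 0.0, 0.00),
                 c(0.25, 0.5, 0.25))
lik     <- rbind(c(0.2, 0.1, 0.05),
                 c(0.2, 0.1, 0.05))

m2ll <- mixture_m2ll(weights, lik)
print(m2ll)
```

A pair with a fully observed genotype has weight 1 on one class, so its contribution reduces to the ordinary likelihood for that class.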

Page 25: SEM with Measured Genotypes

mxModel("DZ_",
  mxData(dzData, type="raw"),
  mxModel("DZ_AAAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AAAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AAaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AAaa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AaAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AaAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_AaAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_AaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_Aaaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_Aaaa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaAA", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaAA", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaAa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaAa", selVars, "ACE.expThre", vector=T)),
  mxModel("DZ_aaaa", mxFIMLObjective("ACE.expCovDZ", "ACE.expMean_aaaa", selVars, "ACE.expThre", vector=T)),
  mxMatrix(name="dzWeights", type="Full", nrow=dzN, ncol=9, free=F, values=dzGenProb),
  mxMatrix(name="Zero", type="Zero", nrow=dzN, ncol=1),
  # Weighted mixture over all 9 genotype classes for DZ pairs
  mxAlgebra(name="DZmix", expression = -2*sum(log((dzWeights * cbind(
      DZ_AAAA.objective, DZ_AAAa.objective, DZ_AAaa.objective,
      DZ_AaAA.objective, DZ_AaAa.objective, DZ_Aaaa.objective,
      DZ_aaAA.objective, DZ_aaAa.objective, DZ_aaaa.objective)) %*% ACE.U9))),
  mxAlgebraObjective("DZmix")
)

Page 26: SEM with Measured Genotypes

Goal: test whether nicotine dependence is linked to nicotinic receptor variants

• mxCompare(AcegFit, AceFit)
     base comparison ep minus2LL  df      AIC   diffLL diffdf      p
   1 ACEg       <NA> 15 3386.251 730 1926.251       NA     NA     NA
   2 ACEg    ACEonly 14 3390.466 731 1928.466 4.214693      1 0.0401

• printACE(AceFit)
        a^2 c^2  e^2 aS^2
   [1,] 0.62   0 0.38    0

• printACE(AcegFit)
        a^2 c^2  e^2 aS^2
   [1,] 0.61   0 0.38 0.01