Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting Quantitative Trait Loci

1. M. Bogdan, J.K. Ghosh and R.W. Doerge, Genetics 2004 167: 989-999.

2. M. Bogdan and R.W. Doerge, "Mapping multiple interacting QTL by multidimensional genome searches"




$X_{ia}$ - genotype of the i-th individual at locus a

$X_{ia} = 1/2$ - individual is heterozygous at locus a

$X_{ia} = -1/2$ - individual is homozygous at locus a

For $d_{ab} = 10$ cM: $\rho(X_{ia}, X_{ib}) = 0.81$
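The quoted correlation can be checked with a short calculation. The sketch below assumes Haldane's map function (no crossover interference), which the slides do not name, and the function names are illustrative. For backcross genotypes coded -1/2 and 1/2, two loci with recombination fraction r have correlation 1 - 2r.

```python
import math


def haldane_recombination(d_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane, assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def genotype_correlation(d_cm: float) -> float:
    """Correlation of -1/2, 1/2 coded backcross genotypes at two loci."""
    r = haldane_recombination(d_cm)
    # Genotypes agree with probability 1 - r and differ with probability r,
    # so the correlation of the codes is (1 - r) - r = 1 - 2r.
    return 1.0 - 2.0 * r


print(round(genotype_correlation(10.0), 3))  # 0.819
```

Under this assumption the value for 10 cM is about 0.82, close to the 0.81 shown above.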

Data for QTL mapping

$Y_1, \dots, Y_n$ - vector of trait values for n backcross individuals

$X = [X_{ij}]$, $1 \le i \le n$, $1 \le j \le m$ - genotypes of m markers


Standard methods of QTL mapping

One QTL model

$$Y_i = \mu + \beta Q_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2) \qquad (1)$$

$Q_i \in \{-1/2, 1/2\}$ - QTL genotype

1. Search over markers - fit model (1) at each marker and choose as candidate QTL locations the markers for which the likelihood exceeds a preestablished threshold value.
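A marker-by-marker search of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the data are simulated, the LOD score (log10 likelihood ratio against the no-QTL model) stands in for the likelihood-based statistic, and the threshold of 3 is a hypothetical value.

```python
import math
import random


def simulate_backcross(n, m, r, qtl, beta, rng):
    """Simulate n individuals with m markers on one chromosome.

    r is the recombination fraction between neighbouring markers; the trait
    follows model (1) with the QTL placed exactly at marker index `qtl`.
    """
    X, Y = [], []
    for _ in range(n):
        g = [rng.choice((-0.5, 0.5))]
        for _k in range(m - 1):
            g.append(-g[-1] if rng.random() < r else g[-1])
        X.append(g)
        Y.append(beta * g[qtl] + rng.gauss(0.0, 1.0))
    return X, Y


def lod_at_marker(x, y):
    """LOD score of the one-QTL regression against the null model."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = sum((b - my) ** 2 for b in y)   # residual sum of squares, null
    rss1 = rss0 - sxy ** 2 / sxx           # residual sum of squares, model (1)
    return (n / 2.0) * math.log10(rss0 / rss1)


rng = random.Random(0)
X, Y = simulate_backcross(n=200, m=20, r=0.09, qtl=7, beta=1.5, rng=rng)
lods = [lod_at_marker([row[j] for row in X], Y) for j in range(len(X[0]))]
threshold = 3.0  # hypothetical threshold, for illustration only
candidates = [j for j, lod in enumerate(lods) if lod > threshold]
print(candidates)
```

Because neighbouring markers are strongly correlated, several markers around the true position usually pass the threshold, which is one reason a single-marker scan cannot separate closely linked QTL.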


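Interval mapping (Lander and Botstein, 1989)

• Consider a fixed position between markers

$I_i$ - state of the flanking markers,

$$I_i \in \left\{ \left(-\tfrac12, -\tfrac12\right), \left(-\tfrac12, \tfrac12\right), \left(\tfrac12, -\tfrac12\right), \left(\tfrac12, \tfrac12\right) \right\}$$

$p_i = P\left(Q_i = \tfrac12 \mid I_i\right)$ - easy to compute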
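As an illustration of why the conditional probability of the QTL genotype is easy to compute, the sketch below evaluates it from the two flanking marker genotypes. It assumes Haldane's map function and no crossover interference, neither of which is stated in the slides; the function names and arguments are hypothetical.

```python
import math


def haldane(d_cm):
    """Recombination fraction for a distance in cM (Haldane, assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def prob_qtl_het(left, right, d_left, d_right):
    """P(Q = 1/2 | flanking genotypes) at a position between two markers.

    left, right: flanking marker genotypes coded -1/2 or 1/2
    d_left, d_right: distances in cM from the position to each marker
    """
    r1, r2 = haldane(d_left), haldane(d_right)

    def joint(q):
        # P(Q = q) = 1/2 in a backcross; without interference the two
        # flanking markers are independent given Q.
        p_left = (1.0 - r1) if q == left else r1
        p_right = (1.0 - r2) if q == right else r2
        return 0.5 * p_left * p_right

    return joint(0.5) / (joint(0.5) + joint(-0.5))


print(prob_qtl_het(0.5, 0.5, 5.0, 5.0))   # close to 1
print(prob_qtl_het(0.5, -0.5, 5.0, 5.0))  # 0.5 at the midpoint
```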


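$$Y_i = \mu + \beta Q_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$

$$f(Y_i \mid I_i) = p_i \, N\left(\mu + \tfrac12 \beta, \sigma^2\right) + (1 - p_i) \, N\left(\mu - \tfrac12 \beta, \sigma^2\right)$$

$$L(Y \mid I) = \prod_{i=1}^{n} f(Y_i \mid I_i)$$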

1. Estimate μ, β, and σ by EM algorithm and compute the corresponding likelihood.

2. Repeat this procedure for a new possible QTL location.

3. Plot the resulting likelihoods as the function of assumed QTL position.
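Step 1 can be sketched for a single assumed QTL position as below. This is a minimal EM implementation for the two-component normal mixture of interval mapping, written for illustration only; the starting values, stopping rule, and simulated data are assumptions, and `p` holds the probabilities that the QTL genotype is 1/2 given the flanking markers.

```python
import math
import random


def normal_pdf(y, mean, sigma):
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_interval_mapping(y, p, n_iter=500, tol=1e-8):
    """Return (mu, beta, sigma, log-likelihood at the last E-step)."""
    n = len(y)
    mu = sum(y) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in y) / n)
    beta = 0.0
    loglik_old = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability w_i that Q_i = 1/2.
        w, loglik = [], 0.0
        for y_i, p_i in zip(y, p):
            a = p_i * normal_pdf(y_i, mu + 0.5 * beta, sigma)
            b = (1.0 - p_i) * normal_pdf(y_i, mu - 0.5 * beta, sigma)
            w.append(a / (a + b))
            loglik += math.log(a + b)
        if loglik - loglik_old < tol:
            break
        loglik_old = loglik
        # M-step: least squares with E[Q_i] = w_i - 1/2 and E[Q_i^2] = 1/4.
        q = [w_i - 0.5 for w_i in w]
        sq, sy = sum(q), sum(y)
        syq = sum(y_i * q_i for y_i, q_i in zip(y, q))
        det = n * (n / 4.0) - sq * sq
        mu = (sy * (n / 4.0) - sq * syq) / det
        beta = (n * syq - sq * sy) / det
        s2 = sum((y_i - mu) ** 2 - 2.0 * beta * q_i * (y_i - mu)
                 + beta * beta / 4.0 for y_i, q_i in zip(y, q)) / n
        sigma = math.sqrt(s2)
    return mu, beta, sigma, loglik


# Simulated example: flanking markers are fairly informative (p_i is 0.1 or 0.9).
rng = random.Random(1)
p = [rng.choice((0.1, 0.9)) for _ in range(300)]
Q = [0.5 if rng.random() < p_i else -0.5 for p_i in p]
y = [2.0 + 1.5 * q_i + rng.gauss(0.0, 1.0) for q_i in Q]
mu_hat, beta_hat, sigma_hat, loglik = em_interval_mapping(y, p)
print(mu_hat, beta_hat, sigma_hat, loglik)
```

Repeating the call on a grid of positions, each with its own vector `p`, gives the likelihood profile described in steps 2 and 3.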


• Problems with interval mapping

a) Not able to distinguish closely linked QTL

b) Not able to detect epistatic QTL (involved only in interactions)

• Solution

Estimate the location of several QTL at once using a multiple regression model (Kao et al. 1999)

$$Y_i = \mu + \sum_{j=1}^{p} \beta_j Q_{ij} + \sum_{1 \le j < l \le m}^{r} \gamma_{jl} Q_{ij} Q_{il} + \varepsilon_i$$


Problem: estimation of the number of additive and interaction terms

$$Y_i = \mu + \sum_{j=1}^{p} \beta_j X_{i h_j} + \sum_{j=1}^{r} \gamma_j X_{i u_j} X_{i k_j} + \varepsilon_i$$

$X_{ij}$ - genotype of the j-th marker

average number of markers - between 200 and 400


Bayesian Information Criterion

• Choose the model which maximizes

$$\log L - \tfrac12 k \log n$$

L – likelihood of the data for a given model

k – number of parameters in the model

n – sample size

Broman (1997) and Broman and Speed (2002) – BIC overestimates the number of QTL


How to modify BIC?

$M_i$ – i-th linear model (specifies which markers are included in regression)

$\theta = (\mu, \beta_1, \dots, \beta_p, \gamma_1, \dots, \gamma_r, \sigma)$ – vector of parameters for $M_i$

$f_i(\theta)$ – density of the prior distribution for θ

$\pi(i)$ – prior probability of $M_i$


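$L(Y \mid \theta)$ – likelihood of the data given the vector of parameters θ

$m_i(Y)$ – likelihood of the data given the model $M_i$

$$m_i(Y) = \int L(Y \mid \theta) f_i(\theta) \, d\theta$$

$$P(M_i \mid Y) \propto \pi(i) \, m_i(Y)$$

BIC neglects π(i) and uses the asymptotic approximation

$$\log m_i(Y) \approx \log L(Y \mid \hat\theta) - \tfrac12 (p + r + 2) \log n$$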


neglecting π(i) = assigning the same prior probability to all models = assigning high prior probability to the event that there are many regressors

Example: 200 markers

200 models with one additive term

$\binom{200}{2} = 19\,900$ models with one interaction or with two additive terms

$\binom{200}{100} \approx 9.05 \times 10^{58}$ models with 100 additive terms
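The model counts in this example are binomial coefficients and can be reproduced directly. The short check below uses only the Python standard library; the variable names are illustrative.

```python
import math

M = 200  # number of markers in the example

one_additive = math.comb(M, 1)      # models with a single additive term
pairs = math.comb(M, 2)             # one interaction, or two additive terms
hundred_terms = math.comb(M, 100)   # models with 100 additive terms

print(one_additive, pairs)          # 200 19900
print(f"{hundred_terms:.2e}")
```

With a uniform prior over models, nearly all of the prior mass therefore sits on models with many regressors, which is the point of the example.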


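Idea: supplement BIC with a more realistic prior distribution π

$$S(i) = \log \pi(i) + \log L(Y \mid \hat\theta) - \tfrac12 \big(p(i) + r(i)\big) \log n$$

$$\log L(Y \mid \hat\theta) = C(n) - \tfrac{n}{2} \log RSS$$

RSS - residual sum of squares from regression

$$\tilde S(i) = n \log RSS + \big(p(i) + r(i)\big) \log n - 2 \log \pi(i)$$

Maximizing $S(i)$ is equivalent to minimizing $\tilde S(i)$, since $\tilde S(i) = -2 S(i) + 2 C(n)$.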


Choice of π (George and McCulloch, 1993)

M - number of markers

$N = \frac{M(M-1)}{2}$ - number of potential interactions

α - the probability that the i-th additive term appears in the model

ν - the probability that the j-th interaction term appears in the model

$$\pi(M) = \alpha^{p} \nu^{r} (1 - \alpha)^{M - p} (1 - \nu)^{N - r}$$

where, on the left-hand side, M denotes a model with p additive terms and r interactions


We choose $\alpha = \frac{1}{l}$ and $\nu = \frac{1}{u}$

$$\log \pi(M) = C(M, N, l, u) - p \log(l - 1) - r \log(u - 1)$$

$$\tilde S(i) = n \log RSS + (p + r) \log n + 2p \log(l - 1) + 2r \log(u - 1)$$

Prior distribution on the number of additive terms: p ~ Binomial(M, α)

Prior distribution on the number of interactions: r ~ Binomial(N, ν)
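The expression for log π(M) follows from the product form of the prior by a short calculation, sketched here. The explicit form of the constant C(M, N, l, u) is derived in this sketch and is not given in the slides.

```latex
\begin{align*}
\log \pi(M)
  &= p \log\alpha + (M - p)\log(1 - \alpha) + r \log\nu + (N - r)\log(1 - \nu) \\
  &= M \log(1 - \alpha) + N \log(1 - \nu)
     + p \log\frac{\alpha}{1 - \alpha} + r \log\frac{\nu}{1 - \nu} \\
  &= \underbrace{M \log\left(1 - \tfrac{1}{l}\right) + N \log\left(1 - \tfrac{1}{u}\right)}_{C(M, N, l, u)}
     - p \log(l - 1) - r \log(u - 1),
\end{align*}
% using \alpha = 1/l, so that \alpha / (1 - \alpha) = 1 / (l - 1), and likewise for \nu = 1/u.
```

The constant does not depend on p or r, so it can be dropped when models are compared, and the term -2 log π(M) contributes the extra penalty 2p log(l - 1) + 2r log(u - 1).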


Choice of l and u should depend on the prior knowledge on the number of QTL.

$$E(p) = \frac{M}{l}, \qquad E(r) = \frac{N}{u}$$

Our choice: for the sample size 200, the probability of wrongly detecting QTL (when there are none) ≈ 0.05

We keep E(p) and E(r) equal to 2.2

The choice is supported by a theoretical bound on the type I error based on the Bonferroni inequality.


$$\tilde S(i) = n \log RSS + (p + r) \log n + 2p \log(M / 2.2) + 2r \log(N / 2.2)$$

Additional penalty similar to the Risk Inflation Criterion of Foster and George ($2k \log t$, where t is the total number of available regressors) and to the modification of BIC proposed by Siegmund (2004).
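The criterion with the 2.2 choice can be written as a small function. The sketch below is an illustration, not the authors' implementation: the function name, its arguments, and the RSS values in the usage example are hypothetical, and the criterion is taken in the form that is minimized.

```python
import math


def mbic(rss, n, p, r, M, expected_terms=2.2):
    """Modified BIC (smaller is better) for a model with p additive terms
    and r interaction terms, given its residual sum of squares.

    M is the number of markers; N = M(M-1)/2 is the number of potential
    interactions. expected_terms is the prior expected number of terms.
    """
    N = M * (M - 1) // 2
    bic_part = n * math.log(rss) + (p + r) * math.log(n)
    extra_penalty = (2 * p * math.log(M / expected_terms)
                     + 2 * r * math.log(N / expected_terms))
    return bic_part + extra_penalty


# Hypothetical residual sums of squares for the null model and for a model
# with one additive term, n = 200 individuals and M = 200 markers.
null_score = mbic(rss=250.0, n=200, p=0, r=0, M=200)
one_term_score = mbic(rss=230.0, n=200, p=1, r=0, M=200)
print(one_term_score < null_score)  # True: the additive term is retained
```

Relative to plain BIC, each additive term costs an extra 2 log(M / 2.2) and each interaction an extra 2 log(N / 2.2), so interactions need stronger evidence because N is much larger than M.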


Search over 12 chromosomes, markers spaced every 10 cM

n     h^2     p    corr   extr   r   corr   extr
200   0       0    0.95   0.03   0   -      0.02
500   0       0    0.99   0.01   0   -      0
200   0.2     1    1      0.03   0   0      0.02
200   0.195   0    -      0.01   1   0.95   0.04


n     h^2    p    corr   extr   r   corr   extr
200   0.55   0    -      0.02   3   2.88   0.08
200   0.5    7    5.06   0.26   0   -      0.09
500   0.5    7    6.99   0.14   0   -      0.03
200   0.43   12   2.39   0.31   0   -      0.03
500   0.43   12   9.68   0.47   0   -      0.02
200   0.71   12   9.53   0.75   0   -      0.02
200   0.53   2    1.95   0.04   5   2.11   0.11
500   0.53   2    2      0.03   5   3.47   0.08


• The criterion adjusts well to the number of available markers

• For n = 200 the criterion detects almost all additive QTL with individual $h^2 = 0.13$ and interactions with $h^2 = 0.2$.

• For n = 500 the criterion detects almost all additive QTL with individual $h^2 = 0.06$ and interactions with $h^2 = 0.12$.


Bound for the type I error

$S_1$ - the maximum of the criterion over all one-dimensional models

$S_0 = \log L(Y \mid \hat\mu, \hat\sigma)$ - the value of the criterion for the null model

D - the number of terms chosen by our criterion

$$P(D > 0) \approx P(S_1 > S_0)$$


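$S_{M_i}$ - the value of the criterion for a given one-dimensional model $M_i$

$S_{M_i} > S_0$ if

$$2 \log \frac{L(Y \mid \hat\theta_{M_i})}{L(Y \mid \hat\theta_0)} > \log n + 2 \big(\log(l - 1) \text{ or } \log(u - 1)\big)$$

(log(l - 1) applies to an additive term, log(u - 1) to an interaction)

$$P(S_{M_i} > S_0) \approx 2 P\left(Z > \sqrt{\log n + 2 \big(\log(l - 1) \text{ or } \log(u - 1)\big)}\right), \qquad \text{where } Z \sim N(0, 1)$$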


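By the Bonferroni inequality and the bound

$$P(Z > x) \le \frac{1}{x \sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$$

$$P(S_1 > S_0) \le \frac{2M}{(l - 1) \, C(l, n)} + \frac{2N}{(u - 1) \, C(u, n)}$$

where $C(l, n) = \sqrt{2\pi n \, \big(\log n + 2 \log(l - 1)\big)}$, obtained by substituting $x = \sqrt{\log n + 2 \log(l - 1)}$ into the bound.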


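For $l = M / 2.2$ and $u = N / 2.2$:

$$P(S_1 > S_0) \lesssim \frac{4.4}{\sqrt{2\pi n}} \left( \frac{1}{\sqrt{\log n + 2 \log(l - 1)}} + \frac{1}{\sqrt{\log n + 2 \log(u - 1)}} \right)$$

For n = 200 and typical values of M this yields values in the range between 0.057 and 0.08.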
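The bound is simple to evaluate numerically. The sketch below computes the right-hand side 4.4 / sqrt(2πn) times the sum of the two reciprocal square roots, with l = M / 2.2, u = N / 2.2 and N = M(M - 1) / 2; the function name and the choice M = 200 are assumptions made for illustration.

```python
import math


def type_one_error_bound(n, M, expected_terms=2.2):
    """Approximate Bonferroni bound on P(S_1 > S_0) for sample size n
    and M markers, with l = M / expected_terms and u = N / expected_terms."""
    N = M * (M - 1) / 2.0
    l = M / expected_terms
    u = N / expected_terms
    additive = 1.0 / math.sqrt(math.log(n) + 2.0 * math.log(l - 1.0))
    interaction = 1.0 / math.sqrt(math.log(n) + 2.0 * math.log(u - 1.0))
    front = 2.0 * expected_terms / math.sqrt(2.0 * math.pi * n)
    return front * (additive + interaction)


bound = type_one_error_bound(n=200, M=200)
print(round(bound, 3))  # 0.058
```

For n = 200 and M = 200 this gives about 0.058, inside the range quoted above; the value grows slowly as M decreases.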