FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN...

41
F ST and kinship for arbitrary population structures Alejandro Ochoa John D. Storey Lab Center for Statistics and Machine Learning, and Lewis-Sigler Institute for Integrative Genomics, Princeton University 2017-02-13

Transcript of FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN...

Page 1: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST and kinship for arbitrary population structures

Alejandro Ochoa

John D. Storey Lab

Center for Statistics and Machine Learning, andLewis-Sigler Institute for Integrative Genomics,

Princeton University

2017-02-13

Page 2: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Why study FST and kinship?

Human geneticsis fascinating!

Pop. structureconfoundsassociationstudies (GWAS)

Heritability ofcomplex traits

Animal and plantbreeding

Page 3: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST measures population structure and differentiation

●●

●●

●●

●●●●

●●

●●

●●●●●

●●●

●●

●●●

●●

● ●

●●

●●

●●

●●

00.

20.

4

Alle

le fr

eque

ncy

Median differentiation SNP (rs11692531)F̂ST ≈ 0.081 using Weir-Cockerham estimatorHuman Genome Diversity Project (HGDP)

Page 4: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST in the independent subpopulation model

●●●

●●

0.4

0.6

Alle

le fr

eque

ncy

Illustration.

FST =Var(pSi∣∣T)

pTi(1− pTi

) .

Page 5: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST estimation is constrained to independent subpopulations

T

S1

f TS1

S2

f TS2

...SK

f TSK

Indep. Subpops.T

S1 S2 ... SK

A1 A2 ... An

Admixture

00.

050.

150.

25

Kin

ship

Indi

vidu

als

Page 6: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Our contribution

Previous FST definitions/estimators assume independent subpopulations.

1. We generalize FST for arbitrary populations, in terms of individuals.2. We characterize the bias of popular estimators under arbitrary population

structure, through theory and simulations.3. We develop a new estimator of kinship and FST for arbitrary population

structures.

Page 7: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Confusion: three versions of FST

Definition 1: FST as ameasure of relatedness in apopulation

FST = f̄ TS = θT or θ̄T .

Initially estimated frompedigrees.

Definition 2: FST as aparameter controllingallelic variance

FST =Var(pSi∣∣T)

pTi(1− pTi

) .Def. 1 ⇒ Def. 2 with FST

I Shared across loci i .I No µ or selection.

Definition 3: FST as astatistic of locus-specificvariance

FST,i =σ̂2i

p̄i(1− p̄i).

Goals:I Varies per locus i .I Measures µ and

selection.

Our generalized definition corresponds most closely to Definition 1.

Page 8: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Wright’s FST in cattle

Populations:T : ShorthornS : Dutchessstrain

Wright (1951)

Page 9: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Populations related by a treeT

Ljk

Lj

Lk

kinship orcoancestry

...inbreeding

Page 10: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST in a subdivided population: Wright (1951)T

S

I

FIS ...

FST ...FIT

(1− FIT) = (1− FIS)(1− FST)

Page 11: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Admixed populations have complex structures

US individuals are often admixed from populations across the world.I European: UK, Ireland, Germany, ItalyI African: West AfricaI Hispanic: Puerto Rico, MexicoI Asian: China, India

African-Americans and Hispanics are recently admixed (5-15 generations ago)from differentiated populations.

Admixture proportions vary (admix. LD) ⇒ complex kinship.

GWAS and heritability estimation in multiethnic or admixed data?

Page 12: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Recently admixed populations

African-Americans

Baharian, et al. (2016)

Hispanics

Moreno-Estrada, et al. (2013)

Page 13: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Admixed siblings from different populations?

Lucy and Maria, UK

Ochoa brothers, MX

High Admixture LD:

Moreno-Estrada, et al. (2013)

Solution: treat every individual as its own population!

Page 14: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

SNP dataExample: Genotype CC CT TT

xij 0 1 2Genome\Wide(SNP(Data(

0 2 2 1 1 0 1 0 2 1 0 1 2 . . .

Individuals

SN

Ps

X

Page 15: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

An unstructured population

Individuals mate randomly.

In a large population T , genotypes

xij ∼ Binomial(2, pTi ),

at SNP i with reference allele frequencypTi , for any individual j .

This is “Hardy-Weinberg Equilibrium”. 0 1 2

pi = 0.25

Genotype ( xij)

Pro

babi

lity

0.0

0.2

0.4

0.6

Page 16: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Inbreeding coefficient f Tj

f Tj : Probability that the two alleles ofindividual j at a random SNP are“identical by descent” (IBD) given anancestral population T .

A structured population has f Tj > 0.

0 1 2

pi = 0.25, fj = 0.5

Genotype ( xij)

Pro

babi

lity

0.0

0.2

0.4

0.6

Page 17: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Kinship coefficients ϕTjk

ϕTjk : Probability that one allele of

individual j and one of individual k , at arandom SNP, are IBD, given anancestral population T .

Local kinship,given unrelated founders

j , k relation ϕTjk

self 1/2child 1/4sibling 1/4

half sibling 1/8uncle or nephew 1/8first cousins 1/16

second cousins 1/64unrelated 0

Page 18: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Kinship model for genotypes

Let T be the ancestral population. In the absence of selection or mutation, allelefrequencies drift randomly from the ancestral frequency pTi , with covariancesmodulated by the kinship coefficients:

E[xij |T ] = 2pTi ,

Var(xij |T ) = 2pTi(1− pTi

)(1 + f Tj ),

Cov(xij , xik |T ) = 4pTi(1− pTi

)ϕTjk .

Note that ϕTjj = 1

2

(1 + f Tj

).

(Wright 1921, Malécot 1948, Wright 1951, Jacquard 1970).

Page 19: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Individual-level analogs of FIT, FIS, FST“Total” coef., analogous to FIT:f Tj and ϕT

jk are relative to T .

“Local” coef., analogous to FIS:fLjj is relative to Lj ,ϕLjkjk is relative to Ljk .

“Structural” coef., analogous to FST:

f TLj =f Tj − f

Ljj

1− fLjj

,

f TLjk =ϕTjk − ϕ

Ljkjk

1− ϕLjkjk

.

T

Ljk

Lj

fLjkLj

Lk

fLjkLk

f TLjk ...f TLj

Page 20: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST for arbitrary population structures

We propose

FST =n∑

j=1

wj fTLj,

whereI f TLj = inbreeding coefficient of Lj relative to T

I wj ≥ 0,∑n

j=1 wj = 1 are weights

Backward compatible with FST for subpopulations.Coherent with Wright’s 1951 definition.

Page 21: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Coancestry model and individual allele frequencies

This restricted model assumes the existence of individual-specific allelefrequencies πij , modulated by coancestry coefficients θTjk :

E[πij |T ] = pTi ,

Cov(πij , πik |T ) = pTi(1− pTi

)θTjk ,

xij |πij ∼ Binomial(2, πij).

This model excludes local relationships. Given these assumptions, coancestryand kinship coefficients are the same:

θTjk =

{ϕTjk if j 6= k ,

f Tj = 2ϕTjj − 1 if j = k .

FST =n∑

j=1

wjθTjj

Page 22: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST estimation under independent subpopulationsWeir-Cockerham and Hudson FST

estimators with πij simplify to

p̂Ti =1n

n∑j=1

πij ,

σ̂2i =

1n − 1

n∑j=1

(πij − p̂Ti

)2,

F̂ indepST =

m∑i=1

σ̂2i

m∑i=1

p̂Ti(1− p̂Ti

)+ 1

nσ̂2i

a.s.−−−→m→∞

FST.

Under independent subpopulations, FST

can be solved for:

E

[1m

m∑i=1

σ̂2i

]= p(1− p)

TFST,

E

[1m

m∑i=1

p̂Ti(1− p̂Ti

)]= p(1− p)

T(1− FST

n

)

Page 23: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

FST estimation under arbitrary coancestryWeir-Cockerham and Hudson FST

estimators with πij simplify to

p̂Ti =1n

n∑j=1

πij ,

σ̂2i =

1n − 1

n∑j=1

(πij − p̂Ti

)2,

F̂ indepST =

m∑i=1

σ̂2i

m∑i=1

p̂Ti(1− p̂Ti

)+ 1

nσ̂2i

a.s.−−−→m→∞

n(FST − θ̄T

)n − 1 + FST − nθ̄T

Under the general coancestry model,system is underdetermined:

E

[1m

m∑i=1

σ̂2i

]= p(1− p)

T n(FST − θ̄T )

n − 1,

E

[1m

m∑i=1

p̂Ti(1− p̂Ti

)]= p(1− p)

T(1− θ̄T ).

θ̄T : mean coancestry.

In independent subpopulationsθ̄T = 1

nFST.

Page 24: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Admixture models

Draw alleles from a mixture of populations:

πij =K∑

u=1

pSui qju,

where qju is ancestry proportion, pSui is AF in subpopulation Su.If subpopulations are independent and f TSu is FST of Su relative to T , then

θTjk =K∑

u=1

qjuqkufTSu , FST =

n∑j=1

K∑u=1

wjq2juf

TSu .

Page 25: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Our admixture simulationA) Intermediate population differentiation

FS

T

0.0

1.0

0.0

0.2

B) Intermediate population spread

dens

ity

C) Admixture proportions

ance

stry

0.0

1.0

D) Discrete subpopulations approx.

ance

stry

0.0

1.0

position in 1D geography

Page 26: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Comparison of population structures in simulation

T

S1

f TS1

S2

f TS2

...SK

f TSK

Indep. Subpops.T

S1 S2 ... SK

A1 A2 ... An

Admixture

00.

050.

150.

25

Kin

ship

Indi

vidu

als

Page 27: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Bias estimating the generalized FSTThe popular Weir-Cockerham (WC) and Hudson FST estimators, formulated forindependent subpopulations, are biased in our admixture simulation:

0.00

0.05

0.10

A) Indep. Subpops.

Wei

r−C

ocke

rham

Hud

son

Kin

ship

Plu

g−in

− −−

B) Admixture

Wei

r−C

ocke

rham

Hud

son

Kin

ship

Plu

g−in

− − −

True FST

FSTindep limit

FS

T e

stim

ate

FST estimator

Page 28: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Bias estimating kinship coefficientsThe popular kinship estimator from genotypes and its limit are

ϕ̂Tjk =

m∑i=1

(xij − 2p̂Ti

) (xik − 2p̂Ti

)4

m∑i=1

p̂Ti(1− p̂Ti

) a.s.−−−→m→∞

ϕTjk − ϕ̄T

j − ϕ̄Tk + ϕ̄T

1− ϕ̄T,

where ϕ̄Tj and ϕ̄T are weighted mean kinships. In our admixture simulation:

A) True Kinship B) Estimate C) Limit

−0.0

50.

10.

2

Kin

ship

indi

vidu

als

Page 29: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

A new kinship estimator

Bias in new kinship estimator is parametrized by ϕ̄T :

ϕ̂T ,Oldjk =

m∑i=1

(xij − 2p̂Ti

) (xik − 2p̂Ti

)4

m∑i=1

p̂Ti(1− p̂Ti

) a.s.−−−→m→∞

ϕTjk − ϕ̄T

j − ϕ̄Tk + ϕ̄T

1− ϕ̄T,

ϕ̂T ,Newjk =

m∑i=1

(xij − 1)(xik − 1)− 1

4m∑i=1

p̂Ti(1− p̂Ti

) + 1 a.s.−−−→m→∞

ϕTjk − ϕ̄T

1− ϕ̄T.

Remaining bias in ϕ̂T ,Newjk comes from estimating pTi

(1− pTi

)with p̂Ti

(1− p̂Ti

).

Page 30: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

A new kinship estimator

Limit of proposed estimate:

ϕ̂T ,Newjk =

m∑i=1

(xij − 1)(xik − 1)− 1

4m∑i=1

p̂Ti(1− p̂Ti

) + 1 a.s.−−−→m→∞

ϕTjk − ϕ̄T

1− ϕ̄T,

If minj ,k

ϕTjk = 0, then

minj ,k

ϕ̂T ,Newjk

a.s.−−−→m→∞

−ϕ̄T

1− ϕ̄T

Page 31: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Performance of proposed estimator

A) Truth B) Proposed C) Existing

−0.1

00.

10.

20.

3

Kin

ship

Indi

vidu

als

Page 32: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Population-level and Individual-level distances in 1000 GenomesA) Distant populations

YRI

YR

I

CEU

CE

U

CHB

CH

B

B) Hispanic populations

PEL

PE

L

MXL

MX

L

CLM

CLM

PUR

PU

R

C) European populations

FIN

FIN

GBR

GB

R

IBS

IBS

TSI

TS

I

−0.0

50

0.05

0.1

0.15

pairw

ise

FS

T

indi

vidu

als

pairw

ise

FS

T

CH

B−C

HB

CE

U−C

EU

YR

I−Y

RI

CH

B−C

EU

CE

U−Y

RI

CH

B−Y

RI

Population−level estimateMean of individual−levelestimates

D) Pairwise comparisons

0.0

0.1

0.2

PU

R−P

UR

PE

L−P

EL

CLM

−CLM

MX

L−M

XL

PU

R−C

LM

CLM

−MX

L

MX

L−P

EL

PU

R−M

XL

CLM

−PE

L

PU

R−P

EL

E) Pairwise comparisons

0.00

0.05

FIN

−FIN

TS

I−T

SI

GB

R−G

BR

IBS

−IB

S

TS

I−IB

S

IBS

−GB

R

TS

I−G

BR

GB

R−F

IN

IBS

−FIN

TS

I−F

IN

F) Pairwise comparisons

0.00

00.

005

0.01

0

Page 33: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Revised FST estimates in 1000 Genomes

A) Independent Pop Model

AFR

AF

R

SAS

SA

S

AMR

AM

R

EUR

EU

R

EAS

EA

S

B) 1000 Genomes

AFR

AF

R

SAS

SA

S

AMR

AM

R

EUR

EU

R

EAS

EA

S

00.

050.

10.

150.

20.

25

Kin

ship

0.0 0.1 0.2 0.3

05

1015

C) Differentiation

FST or Inbreeding

Den

sity

WCHudsonNew (indiv)New (mean)

Indi

vidu

als

Page 34: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

We have...

...generalized FST using parameters for arbitrary structure in terms of individuals.

...connected FST, kinship coefficients, and admixture models.

...characterized bias of common estimators when assumptions are broken.

...used an admixture simulation to illustrate biases.

...developed new estimators of FST and kinship/coancestry.

Page 35: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Other work from Dr. OchoaModeling the placebo response inpsychiatric drug trialsCollaboration with Otsuka Pharma.

0 5 10 15

5010

015

020

0

Week

PAN

SS

Study

31−93−20231−97−20131−97−202CN138−001CN138−11331−03−239CN138−39731−12−291

Protein sequence analysisImproving sequence homology stats

AB R E

B

B R Globular

H H ... H H TM

H R R H C R C

CC

E

Page 36: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Future work: Selection tests

xi : genotype vector at SNP i ,Φ̂T : kinship matrix estimate,p̂Ti : ancestral allele frequency estimate,

Then this generalized z-score measures deviation of this SNP from the neutralgenetic structure:

z2i =

(xi − 2p̂Ti 1

)ᵀ (Φ̂T)−1 (

xi − 2p̂Ti 1)

4p̂Ti(1− p̂Ti

) .

Complements other info such as selective sweeps.

Page 37: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Future work: Admixture LD

Moreno-Estrada, et al. (2013)

Simple extension:The kinship matrix varies per locusdepending on population assignments.

More general local kinship estimation?

Page 38: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Future work: Kinship in Recent Mutations

Recall the following only holds for neutral SNPs polymorphic in T :

E[xij |T ] = 2pTi ,

Cov(xij , xik |T ) = 4pTi(1− pTi

)ϕTjk .

A SNP that arose from recent mutation in S instead has pTi = 0 or 1 and:

E[xij |S ] = 2pSi ,

Cov(xij , xik |S) = 4pSi(1− pSi

)ϕSjk .

Also recall: (1− ϕT

jk

)=(1− ϕS

jk

) (1− f TS

).

Recent mutations require special treatment in GWAS/herit. studies!

Page 39: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

AcknowledgmentsJohn D. StoreyAndrew BassIrineo CabrerosChee ChenWei HaoEmily NelsonRiley Skeen-Gaar

Neo Christopher ChungInstitute of InformaticsUniversity of Warsaw

Funding:National Institutes of HealthOtsuka Pharmaceuticals

Lewis-Sigler Institute for Integrative Genomics

Page 40: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Future work: Variable kinship in GWAS

Suppose the kinship matrix ΦTi = (ϕT

ijk) varies per locus i :

Cov (xij , xik |T ) = 4pTi(1− pTi

)ϕTijk .

This ΦTi replaces the global kinship ΦT used in LMM and adjusted χ2 GWAS,

varying given local admixture or the recent mutation model.

Page 41: FST and kinship for arbitrary population structures2017/02/13  · C) European populations FIN FIN GBR GBR IBS IBS TSI TSI -0.05 0 0.05 0.1 0.15 pairwise F S T individuals pairwise

Future work: Variable kinship in heritability estimationSuppose the kinship matrix ΦT

i = (ϕTijk) varies per locus i :

Cov (xij , xik |T ) = 4pTi(1− pTi

)ϕTijk .

Let y = (yj) be a trait controlled by additive genetic effects as

yj = µ +∑i∈C

βixij + εj ,

The trait’s covariance structure is now given by the mean kinship at causal loci C :

Cov(y|T ) = σ2 (h22Φ̄T + (1− h2)I), where

Φ̄T =∑i∈C

wiΦTi , wi ∝ β2

i pTi (1− pTi ).