IBD sharing: Theory and applications in the Ashkenazi Jewish population

Post on 24-Feb-2016


Shai Carmi
Pe’er lab, Columbia University

Mt. Sinai, NY
March 2014

About Me

• 2006-2008: Empirical network analysis (computational)

• 2007-2010: Diffusion/navigation in random networks (theory)

• 2010-2011: Anomalous diffusion (theory)

• 2008-2011: RNA splicing and editing (computational/experimental)

• 2012-2014: Population genetics, with Itsik Pe’er

Outline

• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary


Identical-by-Descent (IBD) Sharing

[Figure: individuals A and B and their common ancestor; labels: a shared segment, g]

Definition: A segment is shared IBD if it is inherited from a single recent common ancestor.

What’s “recent”?


• Textbook/Pedigrees: MRCA more recent than a given time (Thompson, Genetics, 2013)

• In practice:
o A segment is IBD if it is longer than a cutoff
o Allow small differences
o Present methods can detect segments > ≈1cM

When is the Common Ancestor “recent”?

[Figure: genealogy of a population; labels: N=10, g=7, Present; axis: Time (generations)]

Why is IBD Useful?

• Segments are rare but long
o Probability of a site being shared
o Segment length
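The "rare but long" trade-off can be illustrated with a small sketch. It rests on a standard assumption that the slide does not state: crossovers fall as a Poisson process at a rate of one per Morgan per meiosis, so two chromosomes whose common ancestor lived g generations ago are separated by 2g meioses. Under that assumption the segment surrounding a given site is the sum of two exponential pieces (one on each side), each with mean 50/g cM. The function names are hypothetical.

```python
import math


def mean_segment_length_cm(g):
    """Mean length (cM) of the shared segment around a site.

    Assumes 2*g meioses separate the two chromosomes and one crossover
    per 100 cM per meiosis, so each side of the site extends an
    exponential distance with mean 100 / (2 * g) cM.
    """
    return 2.0 * 100.0 / (2.0 * g)


def prob_segment_longer_than(m_cm, g):
    """Probability that the segment around a site exceeds m_cm centimorgans.

    The segment length is the sum of two exponentials with rate
    r = 2 * g / 100 per cM, whose survival function is (1 + r*m) * exp(-r*m).
    """
    r = 2.0 * g / 100.0
    return (1.0 + r * m_cm) * math.exp(-r * m_cm)


if __name__ == "__main__":
    for g in (5, 10, 50):
        print(g, mean_segment_length_cm(g), prob_segment_longer_than(3.0, g))
```

Older ancestors give shorter segments, so a length cutoff mostly retains segments from recent ancestors.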

Applications

• A segment indicates recent co-ancestry:
o Disease mapping
o Pedigree reconstruction
o Detecting natural selection
o Demographic (historical) inference

• Identical sequence across individuals:
o Phasing
o Imputation
o Estimating heritability
o Estimating genotyping error rate

Browning and Browning, Annu. Rev. Genet., 2012

IBD Sharing Theory

• Model:
o A population with constant effective size N
o A minimal segment length m
o Two chromosomes of length L

• The fraction of the chromosome in shared segments?

• The number of shared segments?

The IBD Process along the Chromosome

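[Figure: a chromosome from coordinate 0 to L divided into consecutive segments ℓ1 to ℓ10, with coalescence times t1 to t10 plotted against the cutoff m]

In the pictured example, $f_T = (\ell_1 + \ell_5 + \ell_9)/L$ and $n_T = 3$.

Coalescent theory:

Given :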
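The two summary statistics of this slide can be computed directly from a list of segment lengths. The sketch below assumes that a segment counts as shared when its length exceeds the cutoff m; the function name and the example lengths are hypothetical and chosen so that segments 1, 5 and 9 pass the cutoff, as in the slide's example.

```python
def ibd_summary(segment_lengths, chrom_length, cutoff):
    """Return (f_T, n_T) for one pair of chromosomes.

    f_T: fraction of the chromosome covered by segments longer than `cutoff`.
    n_T: number of such segments.
    All lengths must be in the same unit (e.g. cM).
    """
    shared = [length for length in segment_lengths if length > cutoff]
    return sum(shared) / chrom_length, len(shared)


if __name__ == "__main__":
    # Ten hypothetical segment lengths; they sum to the chromosome length.
    lengths = [30, 2, 1, 3, 25, 2, 1, 4, 20, 2]
    f_t, n_t = ibd_summary(lengths, chrom_length=sum(lengths), cutoff=5)
    print(f_t, n_t)
```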

Sample Results

• The avg. fraction of the chr. in shared segments:


• The avg. number of shared segments:

• Implicit expressions for the distributions

Palamara et al., AJHG, 2012; Carmi et al., Genetics, 2013; Carmi and Pe’er, arXiv, 2014
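For reference, here is a sketch of how an average fraction of this kind can be obtained in the constant-size model. It is a derivation under assumptions of this note, not necessarily the expression shown on the slide: (i) the coalescence time t of a pair at a site is exponential with rate 1/N per generation (the slide may define N with a different factor of 2), (ii) m is measured in Morgans, (iii) given t, the segment around the site is the sum of two exponentials with rate 2t, and (iv) chromosome-end effects are ignored.

```latex
\[
P(\ell > m \mid t) = (1 + 2mt)\, e^{-2mt},
\]
\[
\langle f_T \rangle
  = \int_0^\infty \frac{1}{N} e^{-t/N} (1 + 2mt)\, e^{-2mt}\, dt
  = \frac{1}{1 + 2mN} + \frac{2mN}{(1 + 2mN)^2}
  = \frac{1 + 4mN}{(1 + 2mN)^2}.
\]
```

For mN much larger than 1 this behaves like 1/(mN), so the shared fraction falls roughly inversely with population size.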

Outline

• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary

Founder Populations

[Figure: population size over time for a founder population and a non-founder population; labels: Disease alleles, B]

Founder Populations

Recent successes:
• Greece (Tachmazidou et al., Nat. Comm. 2013)
• Finland (Kurki et al. PLoS Genet., 2014)
• Iceland (deCODE) (many papers; most recently Steinthorsdottir et al., Nat. Genet. 2014; Grarup, PLoS Genet., 2013)

A Brief History of Ashkenazi Jews

• Unclear origin
• Ca. 1000: Small communities in Northern France, Rhineland
• Migration east
• Expansion
• Migration to US and Israel
• ≈10M today
• Relative isolation

Ashkenazi Jewish (AJ) Genetics

Behar et al., Nature, 2010; Bray et al., PNAS, 2010; Guha et al., Genome Biol, 2012; Behar et al., Hum. Biol., 2014

Price et al., PLoS Genet., 2008; Olshen et al., BMC Genet, 2008; Need et al., Genome Biol, 2009; Kopelman et al., BMC Genet, 2009

Atzmon et al., AJHG, 2010

[Figure: population clusters labeled AJ; Jewish, non-AJ; Middle-East; Europe]

AJ Genetics: Interim Summary

• Current large population (≈10M)

• IBD analysis: bottleneck of effective size ≈300 (later)

• Mendelian disorders, high frequency risk alleles

• Insight on both European & Middle-Eastern past

• No genealogies

The Ashkenazi Genome Consortium

NY area labs interested in specific diseases

Quantify utility in medical genetics

Learn about population history

Phase I: 128 whole genomes (CG; completed)

Phase II: ≈300 whole genomes (NYGC; under way)

Large genotyped cohorts

Impute

Sequencing Statistics

Statistic                    Per genome (exome)
SNVs                         3.4M (22k)
Novel SNVs                   3.8% (4.1%)
Het/hom ratio                1.65 (1.67)
Insertions                   220k (242)
Deletions                    235k (223)
Multi-nucleotide variants    83k (374)
Synonymous SNVs              10,536
Non-synonymous SNVs          9706
Nonsense SNVs                72
Other disrupting             255
CNVs                         302
SVs                          1480
MEIs                         4090

Results Highlights

• Low false positive rate at ≈5,000 per genome
• 50% more novel variants per genome in AJ (compared to non-Jewish Europeans)
• More genetic diversity in AJ (θ), but less projected for large samples
• More AJ-specific variants compared to EU-specific variants
• A model for EU-Middle-East-AJ ancient history
• A model for AJ recent history
• The panel is necessary for screening clinical AJ genomes
• Catalog of mutations in known AJ disease genes
• Slightly higher mutation burden in AJ
• The panel is useful for imputation

S. C. et al., submitted

A Model for Ancient History

Outline

• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary

A Simple Approach

• Model:
o A constant effective population size N
o A single chromosome of length L
o Sample size n
o For each pair, detect all segments of length >m
o Compute $\langle f_T \rangle$, the average fraction of the chr. shared

• Inference:
o Method of moments
o Can prove:

Palamara et al., AJHG, 2012; Carmi et al., Genetics, 2013
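A minimal sketch of a method-of-moments estimator of this kind follows. It assumes the constant-size expression ⟨f_T⟩ = (1 + 4mN)/(1 + 2mN)^2, with m in Morgans and N defined so that a pair of chromosomes coalesces at rate 1/N per generation. That expression, the convention for N, and the function names are assumptions of the sketch; the slide's own formula is not shown in the text. Equating the observed mean fraction f to the expression and solving the resulting quadratic in 2mN gives N = (1 − f + sqrt(1 − f))/(2mf).

```python
import math


def expected_fraction(N, m):
    """Assumed mean shared fraction for constant size N and cutoff m (Morgans)."""
    x = 2.0 * m * N
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2


def estimate_N(mean_fraction, m):
    """Method-of-moments estimate of N from the observed mean shared fraction.

    Inverts expected_fraction(): the positive root of
    f*x**2 + (2f - 2)*x + (f - 1) = 0 with x = 2*m*N.
    """
    f = mean_fraction
    if not 0.0 < f < 1.0:
        raise ValueError("mean_fraction must lie strictly between 0 and 1")
    return (1.0 - f + math.sqrt(1.0 - f)) / (2.0 * m * f)


if __name__ == "__main__":
    f = expected_fraction(2000, 0.01)  # cutoff of 1 cM
    print(f, estimate_N(f, 0.01))
```

In practice the observed mean would be averaged over all pairs in the sample before being passed to `estimate_N`.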

A Simple Approach

A Maximum Likelihood Approach

Carmi and Pe’er, arXiv, 2014

A Practical Approach

Palamara et al., AJHG, 2012

• Assume historical size N(t) = N0 λ(t).
o Time scaled by 2N0

• Avg. fraction of the genome in segments of length ℓ1<ℓ<ℓ2:

(1)

Method:
• Detect IBD in sample
• Plot the empirical P(ℓ)
• Using Eq. (1), find the history N(t) that fits best

[Figure: P(ℓ) against segment length ℓ (0 to 20), with curves labeled Ne 1000, Ne 2000 and Ne 3000]
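The fitting step needs a way to predict the shared fraction in a length range for a candidate history. The sketch below is a discrete-generation analogue, not Eq. (1) itself (which is stated in continuous time scaled by 2N0). Its assumptions are: a pair of chromosomes coalesces g generations ago with probability 1/N_g given no earlier coalescence, and given g the segment around a site is the sum of two exponentials with rate 2g per Morgan. The function name and the convention for N_g are hypothetical.

```python
import math


def fraction_in_length_range(sizes, l1_cm, l2_cm=math.inf):
    """Expected fraction of the genome in segments with l1_cm < length < l2_cm.

    sizes[g - 1] is the population size g generations ago; the history is
    truncated at len(sizes) generations.
    """
    def survival(length_cm, g):
        # P(segment around a site is longer than length_cm | ancestor at g)
        r = g / 50.0  # 2g meioses, one crossover per 100 cM per meiosis
        return (1.0 + r * length_cm) * math.exp(-r * length_cm)

    total = 0.0
    not_coalesced = 1.0  # probability of no coalescence before generation g
    for g, size in enumerate(sizes, start=1):
        p_coal = not_coalesced / size
        upper = 0.0 if math.isinf(l2_cm) else survival(l2_cm, g)
        total += p_coal * (survival(l1_cm, g) - upper)
        not_coalesced *= 1.0 - 1.0 / size
    return total


if __name__ == "__main__":
    for n in (1000, 2000, 3000):
        print(n, fraction_in_length_range([n] * 5000, 1.0))
```

A fit would compare these predictions, over several length bins, with the empirical P(ℓ) and search over candidate histories; the search itself is left out here.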

IBD Sharing in AJ

• Atzmon et al., AJHG, 2010

• Bray et al., PNAS, 2010

• Gusev et al., MBE, 2012

≈50cM per pair in segments >3cM

An AJ Bottleneck

S. C. et al., submitted

Time (years)

Caveats

• Phasing and genotyping errors; IBD detection errors
• Reasonable power only for 10-50 generations ago
• Model specification (e.g. prolonged bottleneck, admixture)
• Fitting

Parameter                95% confidence interval
Ancestral size           3654-5856
Bottleneck size          249-419
Growth rate (per gen)    16-53%
Bottleneck time (gen)    25-32

Outline

• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary

Imputation

Impute2

• Cost-effective association study design:
o Fully sequence a small reference panel
o Impute many sparsely genotyped individuals

AJ Panel Performance

Fraction of non-reference variants with MAF ≤1% wrongly imputed: 13% for AJ, 35% for CEU

Imputation by IBD

Sequence A

Gusev et al., Genetics, 2012

Imputation by IBD

Sequence A

• How to select individuals for sequencing?
• Is there enough IBD sharing?
• How to impute effectively?

Palin et al., Genet. Epidemiol., 2011; Kong et al., Nat. Genet., 2008

Selection for Sequencing

• Improve performance by selecting top-sharing samples (Gusev et al., Genetics, 2012: INFOSTIP)
• Theory for coverage in a population model (Carmi et al., Genetics, 2013)
• Not terribly important
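As a rough illustration of "selecting top-sharing samples", the sketch below ranks individuals by their total pairwise IBD sharing and keeps the top k. This is a deliberately simple stand-in and not the INFOSTIP algorithm; the function name, the input layout (a dictionary from sample pairs to shared cM) and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def top_sharing_samples(pairwise_sharing, k):
    """Return the k samples with the largest total IBD sharing.

    pairwise_sharing maps (sample_a, sample_b) to the total length (cM)
    shared IBD by that pair. Ties are broken by sample name.
    """
    totals = defaultdict(float)
    for (a, b), shared_cm in pairwise_sharing.items():
        totals[a] += shared_cm
        totals[b] += shared_cm
    ranked = sorted(totals, key=lambda s: (-totals[s], s))
    return ranked[:k]


if __name__ == "__main__":
    sharing = {
        ("s1", "s2"): 40.0,
        ("s1", "s3"): 60.0,
        ("s2", "s3"): 10.0,
        ("s3", "s4"): 5.0,
    }
    print(top_sharing_samples(sharing, 2))
```

A ranking by total sharing ignores redundancy between selected samples, which is one reason a dedicated method may do better.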

Coverage by IBD

Fit to:

TAGC (sequencing; n=128)

SZ study (genotyping; n=2500)
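Coverage here can be read as the fraction of a genotyped individual's genome that lies in IBD segments shared with at least one sequenced individual. Under that reading, which is an assumption of this sketch, coverage is the length of the union of the segment intervals divided by the genome length. The function name and the interval format are hypothetical.

```python
def covered_fraction(segments, genome_length):
    """Fraction of the genome covered by the union of IBD segments.

    segments is an iterable of (start, end) pairs on a single coordinate
    axis (e.g. cM); overlapping or touching segments are merged so that
    no position is counted twice.
    """
    covered = 0.0
    current_start = current_end = None
    for start, end in sorted(segments):
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


if __name__ == "__main__":
    print(covered_fraction([(0, 10), (5, 20), (30, 40)], 100))
```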

Coverage by IBD: Theory

[Figure: population model over time (generations), from generation g+1 through g to the present; labels: N→∞ (three times), B, 1-α]

Exact solution: Define and

Outline

• IBD Sharing: Introduction
• Ashkenazi Jewish Genetics
• Demographic inference
• Imputation
• Future Directions & Summary

Future Directions

• N-way IBD sharing
o Derived P(ℓ1<ℓ<ℓ2) for three chromosomes
o Important for demographic inference, disease mapping, detecting natural selection

• Dating mutations using IBD

• Phasing/imputation using IBD
o A fast approach needed

Dating f2 mutations


Summary

• IBD is useful in genetics

• We characterized IBD in population models

• IBD abundant in AJ and can be used for historical inference and imputation

• Many interesting future applications

Acknowledgements

Funding: Human Frontier Science Program

Itsik Pe’er’s lab: James Xue, Ethan Kochav, Yunzhi Ye

TAGC consortium members:
Todd Lencz, Semanti Mukherjee (LIJMC)
Lorraine Clark, Xinmin Liu (CUMC)
Gil Atzmon, Harry Ostrer, Danny Ben-Avraham (AECOM)
Inga Peter, Judy Cho (MSSM)
Joseph Vijai (MSKCC)
Ken Hui (Yale)

Thank you for your attention!