Population Approaches to Detecting and Genotyping Copy Number Variation
![Page 1: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/1.jpg)
Population Approaches to Detecting and Genotyping Copy
Number Variation
Lachlan Coin
July 2010
![Page 2: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/2.jpg)
Outline
• Population-haplotype approach to CNV detection and genotyping
• Application to SNP and CGH data
• Application to NGS sequence data
![Page 3: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/3.jpg)
cnvHap approach to CNV discovery and genotyping
Coin et al., Nature Methods 7, 541-546 (2010)
![Page 4: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/4.jpg)
Example of trained model
![Page 5: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/5.jpg)
cnvHap models haploid CN transitions
• Specify a per-base global transition rate matrix over copy numbers 0 to 4, indexed by "copy number from" and "copy number to", with entries q00, q10, …
• Rate matrix multiplied by a position-specific scalar rate
• Values trained using EM, following the approach of Klosterman et al., used in Xrate for finding substitution rates
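As a minimal sketch of how a per-base rate matrix scaled by a position-specific rate could give copy-number transition probabilities between two sites, the following computes P = exp(Q · rate · distance) under a standard continuous-time Markov chain assumption. The function names, the one-step gain/loss structure of the matrix, the row = "from", column = "to" orientation, and all numeric rates are illustrative assumptions, not values or code from cnvHap.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, squarings=10, terms=12):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    n = len(m)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds small^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def make_rate_matrix(n_states, rate_up, rate_down):
    """Hypothetical rate matrix: copy number gains or loses one copy at a time.

    Row = copy number from, column = copy number to (an assumed orientation).
    """
    q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            q[i][i + 1] = rate_up
        if i - 1 >= 0:
            q[i][i - 1] = rate_down
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def transition_probs(q, site_rate, distance_bp):
    """Transition probabilities over distance_bp bases at a given site rate."""
    scaled = [[x * site_rate * distance_bp for x in row] for row in q]
    return mat_exp(scaled)
```

For example, `transition_probs(make_rate_matrix(5, 1e-6, 2e-6), 1.5, 10000)` returns a 5 × 5 matrix whose rows each sum to one; a larger position-specific rate makes a copy-number change between the two sites more likely.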
![Page 6: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/6.jpg)
cnvHap joint model of CNV + SNP haplotypes
![Page 7: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/7.jpg)
Cluster positions modelled using a linear model
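$$
\begin{pmatrix} r_m(g) \\ b_m(g) \end{pmatrix} = \beta \begin{pmatrix} f_0(g) \\ \vdots \\ f_5(g) \end{pmatrix}
$$

$$
\begin{aligned}
f_0(g) &= 1 \\
f_1(g) &= \log_2\bigl(CN(g)/2\bigr) \\
f_2(g) &= \bigl(\log_2(CN(g)/2)\bigr)^2 \\
f_3(g) &= bfrac(g) \\
f_4(g) &= bfrac(g) \cdot \bigl(1 - bfrac(g)\bigr) \\
f_5(g) &= bfrac(g) \cdot \bigl(bfrac(g) - 0.5\bigr) \cdot \bigl(bfrac(g) - 1\bigr)
\end{aligned}
$$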
Model fitted using ridge regression carried out at each iteration of the EM algorithm
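A minimal sketch of a ridge fit for one signal (for example r_m) on the six basis functions of copy number and B-allele fraction. It reads the basis as 1, log2(CN/2), its square, bfrac, bfrac·(1 − bfrac) and bfrac·(bfrac − 0.5)·(bfrac − 1); the minus signs in the last two terms are an assumption. The function names, the unweighted fit, the penalty on the intercept, and the restriction to CN ≥ 1 are also assumptions of this sketch. Inside EM, each observation would presumably be weighted by its posterior genotype probability, which is omitted here.

```python
import math


def basis(cn, bfrac):
    """Basis functions f_0..f_5 for a genotype with copy number cn >= 1."""
    log_cn = math.log2(cn / 2)
    return [1.0,
            log_cn,
            log_cn * log_cn,
            bfrac,
            bfrac * (1 - bfrac),
            bfrac * (bfrac - 0.5) * (bfrac - 1)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(features, targets, lam):
    """Ridge estimate: solve (X^T X + lam * I) beta = X^T y.

    The intercept is penalised too, a simplification of this sketch.
    """
    p = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(p)]
    return solve(xtx, xty)
```

The penalty `lam` keeps the coefficients stable when some genotype clusters have few or no observations, which is one plausible reason to prefer ridge over ordinary least squares in this setting.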
![Page 8: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/8.jpg)
Using Illumina SNP arrays
![Page 9: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/9.jpg)
Combined Illumina and Agilent arrays
[Figure: three pairs of panels, each pair labelled Illumina and Agilent]
![Page 10: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/10.jpg)
Some CNVs exhibit shared structure
![Page 11: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/11.jpg)
Improved CNV genotyping accuracy
[Figure: cumulative frequency of squared Pearson correlation]
![Page 12: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/12.jpg)
A deletion at 16p11.2 in a patient with ‘extreme obesity’
• estimated by aCGH to be 546 kb-700 kb
• flanked by segmental duplication (>99% sequence identity)
• probably arises by NAHR, implying deletion is 739 kb
• BMI = 29.2 kg·m⁻² at age 7½
• learning difficulties, delayed speech
[Figure: aCGH log2 ratio (+1 to -3) across 28.9 Mb to 30.7 Mb of chromosome 16 (band p11.2), with MLPA probe and segmental duplication tracks]
RG Walters et al. Nature 463, 671-675 (2010) doi:10.1038/nature08727
![Page 13: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/13.jpg)
16p11.2 deletions in obesity and population cohorts
| Cohort | Obese | Lean/Normal Weight |
|---|---|---|
| British extreme early-onset obesity (SCOOP) | 3/931 | - |
| French child obesity case:control | 4/643 | 0/530 |
| French adult obesity case:control | 4/705 | 0/669 |
| Population cohorts (NFBC1966, CoLaus, EGPUT) | 3/1592 | 1/6235 |
| Swedish discordant siblings | 2/159 | 0/140 |
| French bariatric surgery patients | 2/141 | - |

Obesity: P = 5.8 × 10⁻⁷, OR = 29.8 [3.9–225]
Morbid obesity: P = 6.4 × 10⁻⁸, OR = 43.0 [5.6–329]
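To illustrate how carrier counts of this kind map to an odds ratio and a P value, here is a sketch for a single 2 × 2 table using a one-sided Fisher exact test. The choice of test and the function names are assumptions; the slide does not state how its statistics were computed. Pooling cohorts into one table ignores cohort structure and should not be expected to reproduce the reported values.

```python
import math


def odds_ratio(carriers_case, total_case, carriers_ctrl, total_ctrl):
    """Sample odds ratio for carrying the deletion, cases versus controls."""
    a = carriers_case
    b = total_case - carriers_case
    c = carriers_ctrl
    d = total_ctrl - carriers_ctrl
    if b == 0 or c == 0:
        return math.inf  # undefined or unbounded with an empty cell
    return (a * d) / (b * c)


def fisher_one_sided(carriers_case, total_case, carriers_ctrl, total_ctrl):
    """P(at least carriers_case carriers among cases) with all margins fixed."""
    carriers = carriers_case + carriers_ctrl
    n = total_case + total_ctrl
    denom = math.comb(n, carriers)
    upper = min(total_case, carriers)
    numer = sum(math.comb(total_case, k) * math.comb(total_ctrl, carriers - k)
                for k in range(carriers_case, upper + 1))
    return numer / denom
```

As a usage example, summing the four rows that have both an obese and a lean column gives 13/3099 obese carriers and 1/7574 lean carriers, so a pooled analysis would call `odds_ratio(13, 3099, 1, 7574)` and `fisher_one_sided(13, 3099, 1, 7574)`.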
![Page 14: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/14.jpg)
Coverage affected by GC content
![Page 15: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/15.jpg)
Regression model fit to correct for GC bias
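One way such a correction could work, as a sketch: fit read depth per window as a quadratic function of the window's GC fraction, then rescale each window by the ratio of mean depth to fitted depth. The quadratic form, the multiplicative rescaling, and the function names are assumptions; the slide does not specify the regression model used.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x^2."""
    # Normal equations built from power sums, solved by Gauss-Jordan elimination.
    s = [sum(v ** k for v in x) for k in range(5)]
    t = [sum((v ** k) * w for v, w in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def gc_correct(depth, gc):
    """Rescale each window's depth by mean depth / depth predicted from GC.

    Assumes the fitted curve stays positive over the observed GC range.
    """
    c0, c1, c2 = fit_quadratic(gc, depth)
    mean_depth = sum(depth) / len(depth)
    corrected = []
    for d, g in zip(depth, gc):
        expected = c0 + c1 * g + c2 * g * g
        corrected.append(d * mean_depth / expected)
    return corrected
```

After correction, windows whose depth differs from the mean only because of GC content end up near the mean, so remaining deviations are better candidates for true copy number changes.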
![Page 16: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/16.jpg)
Loess curves fit to remove residual spatial variation of coverage
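The idea can be sketched as a local linear regression with tricube weights along genomic position, with the fitted trend subtracted from the depth signal. The span fraction, the subtraction (rather than division) of the trend, and the function names are assumptions of this sketch, which also omits the robustness iterations of a full loess implementation.

```python
import math


def loess_fit(x, y, frac=0.3):
    """Local linear fit at each x using tricube weights on the nearest points."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))  # points inside each local window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def remove_spatial_trend(positions, depth, frac=0.3):
    """Subtract the smooth trend and add back the mean depth."""
    trend = loess_fit(positions, depth, frac)
    mean_depth = sum(depth) / len(depth)
    return [d - t + mean_depth for d, t in zip(depth, trend)]
```

The span `frac` sets the trade-off: a span that is too narrow would also smooth away real CNV signal, so it should be wide relative to the CNVs of interest.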
![Page 17: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/17.jpg)
Detecting CNVs with NGS data
[Figure panels: depth/haploid coverage; B-allele frequency]
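As an illustration of why these two signals are informative together, the sketch below scores candidate genotypes at one site. It assumes depth is Poisson with mean copy number × haploid coverage, and that the B-allele read count is binomial with success probability equal to the B-allele fraction. These distributions, the error floor `err`, and the function names are assumptions for illustration only, not the model used by the author.

```python
import math


def log_poisson(k, mean):
    """Log Poisson probability of count k with the given mean (mean > 0)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def log_binom(k, n, p):
    """Log binomial probability of k successes in n trials (0 < p < 1)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def genotype_log_lik(depth, b_reads, n_a, n_b, haploid_cov, err=0.01):
    """Log-likelihood of a genotype with n_a copies of A and n_b copies of B."""
    cn = n_a + n_b
    mean_depth = max(cn, err) * haploid_cov  # small floor so CN 0 allows stray reads
    baf = n_b / cn if cn > 0 else 0.5
    baf = min(max(baf, err), 1 - err)  # allow for sequencing error
    return log_poisson(depth, mean_depth) + log_binom(b_reads, depth, baf)


def best_genotype(depth, b_reads, haploid_cov, max_cn=4):
    """Most likely (n_a, n_b) pair with total copy number up to max_cn."""
    candidates = [(a, cn - a) for cn in range(max_cn + 1) for a in range(cn + 1)]
    return max(candidates,
               key=lambda g: genotype_log_lik(depth, b_reads, g[0], g[1],
                                              haploid_cov))
```

Depth alone cannot separate genotypes with the same copy number, and allele fraction alone cannot separate, say, AB from AABB; combining them resolves both, which is the motivation for plotting the two tracks together.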
![Page 18: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/18.jpg)
NGS versus CGH data
[Figure panels: NGS data, chrom1:350mb-351mb; CGH data, chrom1:350mb-351mb]
![Page 19: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/19.jpg)
NGS vs CGH data
![Page 20: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/20.jpg)
Haplotype structure of deletion
![Page 21: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/21.jpg)
NGS amplification Depth/coverage
![Page 22: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/22.jpg)
With consistent break-points in population
![Page 23: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/23.jpg)
Polyploid phasing and imputation
[Figure axes: imputation error rate; switch error rate]
![Page 24: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/24.jpg)
Conclusions
• Population-haplotype model enables joint CNV discovery and genotyping using array data
• Preliminary results indicate the approach will also help with NGS data
• Combining information from multiple platforms improves sensitivity
• Imputation still works for ploidy > 2, but phasing becomes more difficult
![Page 25: Population Approaches to Detecting and Genotyping Copy Number Variation](https://reader035.fdocuments.in/reader035/viewer/2022062409/568150ba550346895dbed4b0/html5/thumbnails/25.jpg)
Acknowledgements
Evangelos Bellos
Shu-Yi Su
Robin Walters
Julian Asher
Alex Blakemore
Adam de Smith
Philippe Froguel
Julia El-Sayed Moustafa
David Balding (UCL)
Rob Sladek (McGill)