Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step...
Transcript of Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step...
![Page 1: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/1.jpg)
Extension of single-step ssGBLUP to many genotyped individuals
Ignacy MisztalUniversity of Georgia
![Page 2: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/2.jpg)
Genomic selection and single-step
•
Simplicity–
No DYD or DP–
No index –
No complexity
•
Accuracy–
Avoids double counting–
Avoids fixed index–
Accounts for preselection bias
H-1 = A-1 + 0 00 G-1 − A22
-1
Aguilar et al., 2010
Christensen and Lund, 2010
![Page 3: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/3.jpg)
Current implementation of SS
•
G and A22
created explicitly •
Quadratic memory and cubic computations•
Cost per 100k genotypes -
1.5 hr (Aguilar et al.,2014)
−
-1 -1
-1 -122
H = A +0 00 G A
![Page 4: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/4.jpg)
Number of genotypes and impending problem
> 2 M for Holsteins> 400k for Angus
Genomic pre-selection issue (Patry and Ducrocq, 2011; VanRaden et al., 2013)
–
BLUP increasingly biased–
Need all data on preselection included
![Page 5: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/5.jpg)
Unsymmetric equations
Misztal et al., 2009
No convergence without good preconditionerNo convergence with large H or A
![Page 6: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/6.jpg)
No G or A22
inverse model
Legarra and Ducrocq (2011)
Slow convergence with few genotypesDivergence with many genotypes
![Page 7: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/7.jpg)
SNP model for genotyped animals
Legarra and Ducrocq, 2011
No successful programming
![Page 8: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/8.jpg)
SNP model for genotyped animals
Liu et al, 2014
![Page 9: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/9.jpg)
SNP effects for all animals (Fernando et al., 2014)
centered genotypes
imputed genotypes
Cost of imputationRequires new type of programmingExtension to complex models unclear
![Page 10: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/10.jpg)
Can regular ssGBLUP be made more efficient?
![Page 11: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/11.jpg)
Scaling up A22-1
![Page 12: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/12.jpg)
Is dimensionality of genomic information limited?
•
Regular G not positive definite past ~5k –
Blending with A (VanRaden, 2008)
•
Dimensionality of SNP BLUP small (Maciotta et al., 2013)
•
Success of imputation
•
Manhattan plots noisy until averaged by 300k-10Mb (depending on species)
![Page 13: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/13.jpg)
![Page 14: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/14.jpg)
…………
Heterogenetic and homogenic tracts in genome (Stam, 1980)
E(#tracts)=4NeL (Stam, 1980)Ne –
effective population sizeL –length of genome in Morgans
Holsteins: Ne ≈100 L=30 Me=12,000
![Page 15: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/15.jpg)
Inversion via SVD/eigenvalue decompositionAssume 1 million animals genotyped with 60k chip
![Page 16: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/16.jpg)
Inverse by Woodbury formula
Ostersen et al., 2017
Mantysaari et al., 2017
![Page 17: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/17.jpg)
If G has limited dimensionality, can G-1
be sparse like A-1?
![Page 18: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/18.jpg)
Use of a la Henderson’s rules?
Use of relatives for G-1
Accuracies not good enoughTheory not clear
![Page 19: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/19.jpg)
Assumption of limited dimensionality
= +u Ts eBreeding value
s
–
n x 1 vector containing additive information ofpopulation (haplotypes, chromosome segments, LD blocks)?
1 c c−≈s T uIf uc
contains n animals:
Very small error
Breeding values of any n animals contains all additive information
![Page 20: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/20.jpg)
=
c c
ncn n
I 0u uP Iu ε
= 0 cncc
nc nn
I 0 I PG 0G
P I 0 IM
−−
−
− = −
11 1cccn
ncnn
I 0G 0I PG
P I0 M0 I
Choose core “c”
and noncore “n”
animals
ε= +n nc c nu P u=c cu u
=var( )n nnε M
![Page 21: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/21.jpg)
How to estimate P
and inv(G)?2var c cc cnu
n nc nn
σ
=
u G Gu G G
cc cc cnnc cc
-1 -1-1 -1 -1G 0 G GG = + M G 'G I
0 0 I
1 1| ,n c nc cc c nc cc− −= =u u G G u P G G
G is “true”
relationship matrix
APY algorithm(Algorithm for Proven and Young)
![Page 22: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/22.jpg)
Properties of APY algorithm
Cost: Almost linear memory and computations
G G-1
G
APY G-1
Cost: Quadratic memory and cubic computations
![Page 23: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/23.jpg)
EAAP meeting 2016
![Page 24: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/24.jpg)
Reliabilities –
Holsteins (77k)
Final score
regular G-1
4.5k 8k 14k 19k 77kNeL 2NeL 4NeL
Pocrnic et al., 2016b
![Page 25: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/25.jpg)
Distribution of segments/haplotypes/..
Assumed dimensionality
Relia
bilit
y
100%
≈4NeL
![Page 26: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/26.jpg)
Costs with 720k genotyped animals
•
30 M Holsteins•
50 M records•
764k 60k genotypes
Item BLUP ssGBLUPAPY G - 7 hA22-1 - 10 minrounds 402 464Time/round 51 s 83 sTotal time 6 h 17 h
Masuda et al., 2017
![Page 27: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/27.jpg)
Which core animals in APY?Bradford et al. (2017)
Simulated populations (QMSim; Sargolzaei and Schenkel, 2009)
Ne = 40
#genotyped animals = 50,000
Core animals:
Random gen 6 || gen 7 || gen8 || gen9 || gen 10 (y)
Random all generations30
![Page 28: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/28.jpg)
Which core animals in APY?
G-1
31
Bradford et al. (2016)
4NeL NeL
![Page 29: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/29.jpg)
Persistence over generations
Generations
Relia
bilit
y
1.0
BLUP
GBLUP –
small population
BayesB –
small population
GBLUP –
very large population
GBLUP –
very large population90% genome
Very large –
equivalent to 4NeL animals with 99% accuracyAre SNP effects from Holstein national populations converging
![Page 30: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/30.jpg)
Theory of limited dimensionalityNumber of haplotypes: 4 Ne LNe within each ¼
Morgan segment
¼
Morgan
Ne haplotypes
QTLsGenome haplotypes
Dimensionality of ¼
Morgan case: Ne
Fragomeni et al., 2018
Ne haplotypes
Ne haplotypes
or number of identified QTLsReduced dimensionality with weighted GRM
![Page 31: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/31.jpg)
ssGBLUP accuracies using SNP60K and 100 QTNs –
simulation study
Fragomeni et al. (2017)
Rank
19k
5k
98
![Page 32: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/32.jpg)
Multitrait ssGBLUP or SNP selection?
•
SNP selection/weighting (BayesB, etc.) –
Large impact with few genotypes–
Little or no impact with many
![Page 33: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/33.jpg)
Variance components
•
Based on SNP–
limitations
•
REML based on relationships–
Equations no longer sparse–
YAMS sparse matrix package –up to 100 times speedup (Masuda et al., 2017)
–
APY for REML
•
Method R (Legarra and Reverter, 2017)
![Page 34: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/34.jpg)
Extra topics
•
Matching pedigrees and genomic relationships•
Missing pedigrees•
Crossbreeding•
Causative SNP
•
Haplotypes for crossbreds (Christensen et al., 2016)•
Metafounders (Legarra et al., 2016)•
Approximation of reliabilities
![Page 35: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/35.jpg)
Conclusions
•
Limited dimensionality of genomic information due to limited effective population size
•
ssGBLUP suitable for any data set and model
•
With large data sets for Holsteins:–
Good persistence of predictions–
Convergence of predictions from different countries
![Page 36: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/36.jpg)
Acknowledgements
Yutaka Masuda
Shogo Tsuruta
Daniela Lourenco
Breno Fragomeni
Ignacio Aguilar
Andres Legarra
Ivan Pocrnic
Heather Bradford
Tom Lawlor, Holstein AssocPaul VanRaden, AGIL USDA
![Page 37: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/37.jpg)
Theory for APY
•
Breeding values of core animals linear functions of:–
Independent chromosome segments (Me)–
Independent effective SNP
•
E(Me)=4 Ne L (Stam, 1980; VanRaden, 2008)Ne –effective population sizeL –
length of genome in Morgans
Me = 4 (Ne=100) (L=30) =12,000
![Page 38: Extension of single-step ssGBLUP to many genotyped individuals · Genomic selection and single-step • Simplicity – No DYD or DP – No index – No complexity • Accuracy –](https://reader036.fdocuments.in/reader036/viewer/2022070710/5ec602d6119be958c93ea4e3/html5/thumbnails/38.jpg)
QTL
Accuracy and distance from markers to QTL
46
Fragomeni
et al. (2017)