Mahdi Saatchi, Iowa State University 6/2/17
2017 BIF Symposium, Athens, Ga.
Mahdi Saatchi
“Development of genomics pipeline for IGS BOLT genetic evaluations”
BIF meeting, Athens, GA June 2, 2017
Outline:
• International Genetic Solutions (IGS).
• IGS BOLT Genetic Evaluations.
• Development of IGS Genomics Database (iGDB).
• QCs on Genotypes at iGDB
International Genetic Solutions (IGS)
American Chianina Association (ACA)
American Gelbvieh Association (AGA)
American Maine Anjou Association (AMAA)
American Shorthorn Association (ASA)
American Simmental Association (ASA)
Canadian Simmental Association (CSA)
Canadian Angus Association (CAA)
Canadian Limousin Association (CLA)
Canadian Shorthorn Association (CSA)
Canadian Gelbvieh Association (CGA)
North American Limousin Foundation (NALF)
Red Angus Association of America (RAAA)
Current IGS genetic evaluation model:
• A multi-step blending approach:
  • Molecular breeding values (MBV) are calculated separately from the traditional multi-breed international cattle evaluation (MB-ICE) and then combined.
GE-EPD = w1*MBV + w2*MB-ICE
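As a minimal sketch of the blending formula above, the function below combines one animal's MBV and MB-ICE value for a single trait. The function name and the numeric weights are hypothetical; the slides do not give the actual values of w1 and w2.

```python
def blend_ge_epd(mbv, mb_ice, w1, w2):
    """Return GE-EPD = w1*MBV + w2*MB-ICE for one animal and one trait."""
    return w1 * mbv + w2 * mb_ice


# Hypothetical inputs for illustration only (not values from the talk).
example_ge_epd = blend_ge_epd(mbv=2.0, mb_ice=1.0, w1=0.25, w2=0.75)
print(example_ge_epd)
```

Because the weights are applied per genotyped animal, relatives without genotypes receive no contribution from the MBV term in this scheme.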
Why Single-step GE-EPDs?
• With blending, GE-EPDs are available only for genotyped animals, whereas in single-step the DNA information affects all relatives of the genotyped animals.
• Improved accuracy, and no bias from estimating blending parameters.
• Avoids the double counting problem (high-EPD animals tend to get high MBV and vice versa)!
• We have powerful tools, such as BOLT, today!
SS-BR vs SS-BLUP:
• SS-BLUP is a breeding value model:
• SS-HM is a marker effect model:
BOLT Single-step Super Hybrid model:
The MME for Super Hybrid model:
[Figure: block structure of the MME; labels: CG, Non-Genotyped, Gen., EBV, ME, Obs.n, Obs.g, Ang]
Development of IGS Genomics DB (iGDB):
• A genomic data-flow pipeline is needed for the BOLT SS-SHM, as all the performance, pedigree and DNA information must be supplied to the BOLT genetic evaluation simultaneously.
Challenges for developing iGDB:
• Genomic data were everywhere but not at IGS!
• Genomic data are in different marker densities (50K, LD, HD, …)
• Genomic data come from different labs (GeneSeek, Zoetis and Delta Genomics) with different formats.
• International ID and sample ID issues!
Number of genotyped/pedigreed animals at iGDB (by snapshot date, YYMMDD):
BRD/CNT   161209  170123  170130  170206  170315  170419
AANUSA      2270    2270    2271    2269    2269    2269
BSHCAN        23      23      52      52      52      52
BSHUSA       908     908     995     994    1017    1037
CHAUSA       320     320     322     322     322     322
GVHCAN      1449    1156    1449    1447    1447    1447
GVHUSA      9659    9642    9680    9670   10241   10487
HERUSA       522     514     522     522     523     523
LIMCAN         0     532     823     821     821     821
LIMUSA        18    5113    5182    5167    5169    5169
RANUSA     13528   13528   13561   13547   17038   19014
RDPUSA       752     752     752     751     752     752
SIMCAN     18127   18329   18717   18650   18820   18888
SIMUSA     17390   17702   18303   18748   20220   20752
TOTAL      65129   70950   72793   73124   78855   81722
QCs applied at iGDB:
• QC on genotypes matters more for the BOLT SS-SHM (genotype quality extends to the whole pedigree).
• QCs on raw genotype call rates.
• Extreme homozygote genotypes.
• Parent-progeny mismatch.
• QCs on imputation.
iGDB: QCs on raw genotype call rates
• Remove animals with low call rate before pooling genotypes (call rate < 0.85).
• Remove animals with low call rate after pooling genotypes (call rate < 0.05).
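The call-rate filter can be sketched as follows. This is an illustration under assumptions, not the iGDB code: genotypes are held as lists of calls with `None` for a missing call, and the function and variable names are hypothetical. Only the 0.85 threshold comes from the slide.

```python
def call_rate(genotypes):
    """Fraction of markers with a non-missing call (None marks a missing call)."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def filter_by_call_rate(animals, threshold):
    """Keep only animals whose call rate is at least `threshold`."""
    return {aid: g for aid, g in animals.items() if call_rate(g) >= threshold}


# Hypothetical animals with 10 markers each.
animals = {
    "A1": ["AA", "AB", "BB", "AA", "AB", "BB", "AA", "AB", "BB", None],      # 0.9
    "A2": ["AA", None, "BB", None, "AB", "BB", None, "AB", "BB", "AA"],      # 0.7
}
kept = filter_by_call_rate(animals, threshold=0.85)  # pre-pooling threshold
print(sorted(kept))
```

The same function can be reused after pooling by passing the post-pooling threshold instead.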
iGDB: QCs on extreme homozygote genotypes
• We observed some animals with extremely unusual homozygote genotype frequencies (AA or BB > 20%). An example: AA 1,199; AB 9,927; BB 41,382.
• We found few animals with such genotypes (only 11 so far) and removed them from iGDB.
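To make the example counts easier to read, the short sketch below converts them to proportions. The function name is hypothetical, and no cutoff is applied because the slide does not spell out exactly how the threshold is evaluated.

```python
def genotype_fractions(counts):
    """Convert genotype counts (e.g. AA/AB/BB) to proportions of all calls."""
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Counts taken from the example animal on the slide.
fractions = genotype_fractions({"AA": 1199, "AB": 9927, "BB": 41382})
for genotype, share in fractions.items():
    print(genotype, round(share, 3))
```

For this animal the two homozygote classes are strongly unbalanced, with BB making up most of the calls.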
iGDB: QCs on parent-progeny mismatch
• We used all 50K markers to check parent-progeny genotype agreement (similar to the parentage test).
Table 1 – From Megan Rolf, KSU, http://articles.extension.org/
iGDB: QCs on parent-progeny mismatch
• Genotype disagreement > 2% → mismatch.
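A minimal sketch of the 2% rule, assuming that "disagreement" means opposing homozygotes (parent AA with progeny BB, or the reverse) among markers called in both animals. The slides do not define the statistic, so this definition and all names here are assumptions.

```python
def disagreement_rate(parent, progeny):
    """Share of jointly called markers where parent and progeny are opposing homozygotes."""
    compared = 0
    conflicts = 0
    for p, o in zip(parent, progeny):
        if p is None or o is None:
            continue  # skip markers missing in either animal
        compared += 1
        if {p, o} == {"AA", "BB"}:
            conflicts += 1
    return conflicts / compared if compared else 0.0


def is_mismatch(parent, progeny, threshold=0.02):
    """Flag the pair when disagreement exceeds the threshold (2% on the slide)."""
    return disagreement_rate(parent, progeny) > threshold


# Hypothetical pair with 100 markers and 3 opposing homozygotes.
parent = ["AA"] * 100
progeny = ["BB"] * 3 + ["AB"] * 97
print(disagreement_rate(parent, progeny), is_mismatch(parent, progeny))
```

Opposing homozygotes are the only single-marker conflicts that can be detected without phasing, which is why they are a common choice for this kind of check.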
Imputation:
• Is a method of inferring some genotypes computationally, using the actual genotypes of relatives.
• It is a necessary step for combining genotypes of different densities before any genetic evaluation.
[Figure: paternal and maternal haplotypes of the sire, dam and progeny]
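The idea of filling in genotypes from relatives can be illustrated with a deliberately simple toy: when both parents are homozygous at a marker, the progeny genotype is fully determined. This is not how production imputation software works (such tools use haplotypes and many relatives); the names below are hypothetical.

```python
def infer_from_parents(sire, dam):
    """Return the progeny genotype if both parents are homozygous, else None."""
    homozygous = ("AA", "BB")
    if sire in homozygous and dam in homozygous:
        # Each homozygous parent can transmit only one allele.
        return "".join(sorted(sire[0] + dam[0]))
    return None


def fill_missing(progeny, sire, dam):
    """Fill missing progeny calls (None) wherever the parents determine them."""
    return [
        g if g is not None else infer_from_parents(s, d)
        for g, s, d in zip(progeny, sire, dam)
    ]


# Hypothetical four-marker example.
progeny = [None, "AB", None, None]
sire = ["AA", "AA", "AB", "BB"]
dam = ["BB", "AB", "AA", "BB"]
filled = fill_missing(progeny, sire, dam)
print(filled)
```

The third marker stays missing because a heterozygous sire can transmit either allele, which is the kind of case where haplotype information is needed.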
Imputation:
We use FImpute software (Sargolzaei, M. et al.) for our imputation pipeline at iGDB.
iGDB: QCs on imputation (switch rate)
• We expect to see the same genotype status after each imputation (consistent genotypes).
• For some markers in some animals this is not true:
  • AA switches to AB, or AB switches to BB …
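One way to quantify this is sketched below, assuming the switch rate is the fraction of an animal's genotype calls that differ between two imputation runs. The slides do not give the exact definition, so treat the definition and the function name as assumptions.

```python
def switch_rate(before, after):
    """Fraction of jointly available calls that differ between two imputation runs."""
    compared = 0
    switched = 0
    for b, a in zip(before, after):
        if b is None or a is None:
            continue  # ignore markers without a call in either run
        compared += 1
        if b != a:
            switched += 1
    return switched / compared if compared else 0.0


# Hypothetical animal: one of four markers changes from AA to AB.
run_1 = ["AA", "AB", "BB", "AA"]
run_2 = ["AB", "AB", "BB", "AA"]
print(switch_rate(run_1, run_2))
```

Averaging this value over animals within a breed, or over animals at a given marker, would give summaries by breed or by marker position.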
Switch rates by breed association data
Number of genotyped animals at IGS (as of 9/30/16):
(Source columns are unlabeled; the last column is the row total.)
Panel      Col 1   Col 2   Col 3   Col 4   Col 5   Col 6   Total
50K          802    1593    3763    1941    1125    7939   17163
9K             -     186       -       -       -    6007    6193
BOS1           -     414     461       -       -    1602    2477
GGP-HD       172    1151       8    2187       -    4374    7892
GGP-UHD        6     567      48     569       4    3679    4873
HD           136     430    1278     226       -     544    2614
SuperLD      111    5206    1984   10699       -   17164   35164
Zoetis         -       -       -    3500       -       -    3500
Total       1227    9547    7542   19122    1129   41309   79876
Switch rates by marker position
Switch rates by breed
Summary:
• We have developed a genomics pipeline for the IGS BOLT genetic evaluations (iGDB).
• There are always challenges in working with genotype data, mostly animal/sample ID issues, which have to be resolved.
• Low-quality genotypes have been detected and removed from the genetic evaluations. These animals need to be re-genotyped if desired.
Summary:
• Parent-progeny genotype disagreement exists. We have found some of these issues, but the process needs further improvement (identify potential sire/grand sire …).
• We need to improve our imputation process (including pedigree information, pooled breed, …)
• We recommend that breeders use higher-density genotypes on animals that don’t have any close relatives with genotypes in iGDB.
iGDB pipeline:
[Figure: pipeline flow diagram; box labels: Raw genotypes (50K, GGP-LD, GGP-HD, ZL5, …), Pre-imputation QC, Pooled Genotypes, QC’d Pooled Genotypes, Imputation, Imputed Genotypes, Post-imputation QC, BOLT]
It takes less than a day to complete the process.
Acknowledgment:
• Iowa State University: Hailin Su, postdoc
• Theta Solutions, LLC: Bruce Golden
• IGS: Lauren Hyde, Steve McGuire, Wade Shafer
Questions?