Mahdi Saatchi, Iowa State University, 6/2/17
2017 BIF Symposium, Athens, Ga.

Mahdi Saatchi
“Development of genomics pipeline for IGS BOLT genetic evaluations”
BIF meeting, Athens, GA, June 2, 2017

Outline:
• International Genetic Solutions (IGS).
• IGS BOLT Genetic Evaluations.
• Development of IGS Genomics Database (iGDB).
• QCs on Genotypes at iGDB.

International Genetic Solutions (IGS)

American Chianina Association (ACA)

American Gelbvieh Association (AGA)

American Maine Anjou Association (AMAA)

American Shorthorn Association (ASA)

American Simmental Association (ASA)

Canadian Simmental Association (CSA)

Canadian Angus Association (CAA)

Canadian Limousin Association (CLA)

Canadian Shorthorn Association (CSA)

Canadian Gelbvieh Association (CGA)

North American Limousin Foundation (NALF)

Red Angus Association of America (RAAA)

Current IGS genetic evaluation model:
• A multi-step blending approach:
  • Molecular breeding values (MBV) are calculated separately from the traditional multi-breed international cattle evaluation (MB-ICE) and then combined.

GE-EPD = w1*MBV + w2*MB-ICE
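The blending formula above is a weighted sum of the two predictions. A minimal sketch in Python, assuming a hypothetical function name `blend_ge_epd` and made-up weights and values (the slide does not state the actual weights):

```python
def blend_ge_epd(mbv: float, mb_ice: float, w1: float, w2: float) -> float:
    """Blend a molecular breeding value with a traditional MB-ICE EPD.

    Implements GE-EPD = w1*MBV + w2*MB-ICE. The weights are inputs because
    the slide does not give their values.
    """
    return w1 * mbv + w2 * mb_ice


# Hypothetical numbers, for illustration only.
example = blend_ge_epd(mbv=2.0, mb_ice=4.0, w1=0.25, w2=0.75)
print(example)
```

Because the MBV exists only for genotyped animals, this blended value can only be computed for them, which is the limitation the single-step approach addresses.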

Why Single-step GE-EPDs?
• With blending, GE-EPDs are available only for genotyped animals, while in single-step the DNA information affects all relatives of the genotyped animals.
• Improved accuracy and removed bias in the estimation of blending parameters.
• Avoids the double-counting problem (high-EPD animals tend to get high MBVs, and vice versa)!
• We have powerful tools, such as BOLT, today!

SS-BR vs SS-BLUP:
• SS-BLUP is a breeding value model:
• SS-HM is a marker effect model:


BOLT Single-step Super Hybrid model:

The MME for Super Hybrid model:
[Figure: MME layout; labels: CG, Non-Genotyped, Gen., EBV, ME, Obs.n, Obs.g, Ang]

Development of IGS Genomics DB (iGDB):
• A genomic data-flow pipeline is needed for the BOLT SS-SHM because all performance, pedigree and DNA information must be fed into the BOLT genetic evaluation simultaneously.

Challenges for developing iGDB:
• Genomic data were everywhere but not at IGS!
• Genomic data are in different marker densities (50K, LD, HD, …).
• Genomic data come from different labs (GeneSeek, Zoetis and Delta Genomics) in different formats.
• International and sample ID issues!!

Number of genotyped/pedigreed animals at iGDB:

BRD/CNT    161209   170123   170130   170206   170315   170419
AANUSA       2270     2270     2271     2269     2269     2269
BSHCAN         23       23       52       52       52       52
BSHUSA        908      908      995      994     1017     1037
CHAUSA        320      320      322      322      322      322
GVHCAN       1449     1156     1449     1447     1447     1447
GVHUSA       9659     9642     9680     9670    10241    10487
HERUSA        522      514      522      522      523      523
LIMCAN          0      532      823      821      821      821
LIMUSA         18     5113     5182     5167     5169     5169
RANUSA      13528    13528    13561    13547    17038    19014
RDPUSA        752      752      752      751      752      752
SIMCAN      18127    18329    18717    18650    18820    18888
SIMUSA      17390    17702    18303    18748    20220    20752
TOTAL       65129    70950    72793    73124    78855    81722

QCs applied at iGDB:
• QC on genotypes is more important for BOLT SS-SHM (genotype quality extends to the whole pedigree).
• QCs on raw genotype call rates.
• Extreme homozygote genotypes.
• Parent-progeny mismatch.
• QCs on imputation.

iGDB: QCs on raw genotype call rates
• Remove animals with low call rate before pooling genotypes (call rate < 0.85).
• Remove animals with low call rate after pooling genotypes (call rate < 0.05).
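The two call-rate filters can be sketched as follows. This is an illustrative Python sketch, not the iGDB code: the genotype coding (0, 1, 2, with `None` for a missing call), the function names and the sample animals are assumptions; only the two thresholds come from the slide.

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0, 1 or 2 copies of allele B; None = no call (assumed coding)

PRE_POOLING_MIN = 0.85   # threshold quoted on the slide (before pooling)
POST_POOLING_MIN = 0.05  # threshold quoted on the slide (after pooling)


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of markers with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def drop_low_call_rate(
    animals: Dict[str, List[Genotype]], min_rate: float
) -> Dict[str, List[Genotype]]:
    """Keep only animals whose call rate is at least min_rate."""
    return {aid: g for aid, g in animals.items() if call_rate(g) >= min_rate}


# Hypothetical animals genotyped on one chip (10 markers each).
raw = {
    "A1": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],           # all markers called
    "A2": [0, None, 2, None, 0, None, 1, 1, 0, 2],  # 7 of 10 markers called
}
kept = drop_low_call_rate(raw, PRE_POOLING_MIN)
print(sorted(kept))
```

The same function can be reused after pooling with `POST_POOLING_MIN`, where the call rate is measured against the pooled marker set rather than the animal's own chip.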


iGDB: QCs on extreme homozygote genotypes
• We observed some animals with extreme, unusual homozygote genotypes (AA or BB > 20%). Example: AA = 1,199; AB = 9,927; BB = 41,382.
• We found few animals with such genotypes (only 11 so far) and removed them from iGDB.
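To see how skewed the example animal is, the genotype counts from the slide can be turned into proportions. A small Python sketch; the function name `genotype_fractions` is hypothetical, and the code only summarizes the counts rather than applying the exact iGDB removal rule:

```python
from typing import Dict


def genotype_fractions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-genotype marker counts into fractions of all called markers."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotype calls to summarize")
    return {genotype: n / total for genotype, n in counts.items()}


# Counts quoted on the slide for one flagged animal.
example_counts = {"AA": 1199, "AB": 9927, "BB": 41382}
fractions = genotype_fractions(example_counts)
for genotype, fraction in fractions.items():
    print(f"{genotype}: {fraction:.1%}")
```

For this animal, BB makes up roughly 79% of the 52,508 calls, while AA is only about 2%.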

iGDB: QCs on Parent-progeny mismatch
• We used all 50K markers to check parent-progeny genotype agreement (similar to the parentage test).

Table 1: from Megan Rolf, KSU, http://articles.extension.org/

• Genotype disagreement > 2% → mismatch.
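A parent-progeny check of this kind can be sketched in Python. The slide does not spell out how disagreement is counted; this sketch assumes the usual parentage-test logic, where a marker disagrees when parent and progeny are opposing homozygotes (AA vs BB). The genotype coding and function names are also assumptions; only the 2% threshold comes from the slide.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0 = AA, 1 = AB, 2 = BB, None = no call (assumed coding)

MISMATCH_THRESHOLD = 0.02  # "disagreement > 2%" from the slide


def disagreement_rate(parent: List[Genotype], progeny: List[Genotype]) -> float:
    """Share of jointly called markers where parent and progeny are opposing homozygotes."""
    if len(parent) != len(progeny):
        raise ValueError("parent and progeny must cover the same markers")
    compared = 0
    conflicts = 0
    for p, o in zip(parent, progeny):
        if p is None or o is None:
            continue  # skip markers missing in either animal
        compared += 1
        if {p, o} == {0, 2}:  # AA parent with BB progeny, or the reverse
            conflicts += 1
    return conflicts / compared if compared else 0.0


def is_mismatch(parent: List[Genotype], progeny: List[Genotype]) -> bool:
    """Flag the pair when the disagreement rate exceeds the threshold."""
    return disagreement_rate(parent, progeny) > MISMATCH_THRESHOLD


# Hypothetical five-marker example.
sire = [0, 0, 2, 1, 2]
calf_ok = [0, 1, 2, 1, 1]
calf_bad = [0, 1, 2, 1, 0]
print(is_mismatch(sire, calf_ok), is_mismatch(sire, calf_bad))
```

Heterozygous calls never count as conflicts here, because a heterozygous parent or progeny is compatible with either allele being transmitted.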

Imputation:
• Is a method of determining some genotypes by computer, using actual genotypes on relatives.
• It is a necessary step for combining genotypes of different densities before any genetic evaluation.

[Figure, slides 17-19: Progeny, Sire and Dam, each labelled paternal and maternal]

Imputation:
We use FImpute software (Sargolzaei, M. et al.) for our imputation pipeline at iGDB.
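As a toy illustration of determining a genotype from relatives, the sketch below fills a missing progeny genotype only when both parents are homozygous, so the result follows directly from Mendelian inheritance. This is not FImpute's algorithm; the genotype coding and function names are assumptions made for the example.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0 = AA, 1 = AB, 2 = BB, None = not genotyped (assumed coding)


def impute_from_parents(sire: Genotype, dam: Genotype) -> Genotype:
    """Infer the progeny genotype at one marker when both parents are homozygous."""
    if sire is None or dam is None:
        return None
    if sire in (0, 2) and dam in (0, 2):
        # Each homozygous parent can only pass on the one allele it carries.
        return (sire + dam) // 2
    return None  # a heterozygous parent leaves the progeny genotype uncertain


def fill_missing(
    progeny: List[Genotype], sire: List[Genotype], dam: List[Genotype]
) -> List[Genotype]:
    """Keep observed progeny genotypes and fill gaps where the parents determine them."""
    return [
        g if g is not None else impute_from_parents(s, d)
        for g, s, d in zip(progeny, sire, dam)
    ]


# Hypothetical four-marker example.
progeny = [None, 1, None, None]
sire = [0, 1, 2, 1]
dam = [2, 1, 2, 0]
print(fill_missing(progeny, sire, dam))
```

The last marker stays missing because the sire is heterozygous; real imputation software resolves such cases with additional information rather than a single-marker rule.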

iGDB: QCs on imputation (switch rate)
• We expect to see the same genotype status after each imputation (consistent genotypes).
• For some markers in some animals this is not true:
  • AA switches to AB, or AB switches to BB …
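A switch rate of this kind can be computed by comparing an animal's observed genotypes with the genotypes reported after imputation. The slide does not give the exact formula, so the Python sketch below assumes the rate is the share of originally called markers whose genotype changed; the coding and the function name are likewise assumptions.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0 = AA, 1 = AB, 2 = BB, None = no call (assumed coding)


def switch_rate(observed: List[Genotype], imputed: List[Genotype]) -> float:
    """Share of originally called markers whose genotype differs after imputation."""
    if len(observed) != len(imputed):
        raise ValueError("both genotype vectors must cover the same markers")
    compared = 0
    switched = 0
    for before, after in zip(observed, imputed):
        if before is None:
            continue  # marker was missing before imputation, so nothing can switch
        compared += 1
        if after != before:
            switched += 1
    return switched / compared if compared else 0.0


# Hypothetical five-marker example: AA -> AB and AB -> BB are switches.
observed = [0, 1, None, 2, 1]
imputed = [1, 2, 0, 2, 1]
print(switch_rate(observed, imputed))
```

Averaging this per-animal value over a breed, or computing the same ratio per marker across animals, gives breed-level or position-level summaries.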

Switch rates by breed association data

Number of genotyped animals at IGS (as of 9/30/16):
[Column headers not recovered; the last column is the row total; "-" = empty cell]

50K          802    1593    3763    1941    1125    7939   17163
9K             -     186       -       -       -    6007    6193
BOS1           -     414     461       -       -    1602    2477
GGP-HD       172    1151       8    2187       -    4374    7892
GGP-UHD        6     567      48     569       4    3679    4873
HD           136     430    1278     226       -     544    2614
SuperLD      111    5206    1984   10699       -   17164   35164
Zoetis         -       -       -    3500       -       -    3500
Total       1227    9547    7542   19122    1129   41309   79876

Switch rates by marker position

Switch rates by breed

Summary:
• We have developed a genomics pipeline for the IGS BOLT genetic evaluations (iGDB).
• There are always challenges in working with genotype data, mostly animal/sample ID issues that have to be resolved.
• Low-quality genotypes have been detected and removed from the genetic evaluations. These animals need to be re-genotyped if of interest.
• Parent-progeny genotype disagreements exist. We have found some of these issues, but the check needs further improvement (identifying the potential sire/grandsire …).
• We need to improve our imputation process (including pedigree information, pooled breeds, …).
• We recommend that breeders use higher-density genotypes on animals that don’t have any close relatives with genotypes in iGDB.

iGDB pipeline:
[Flowchart; box labels: Raw genotypes (50K, GGP-LD, GGP-HD, ZL5, …); Pooled Genotypes; Pre-imputation QC; QC’d Pooled Genotypes; Imputation; Imputed Genotypes; Post-imputation QC; BOLT]

It takes less than a day to complete the process.

Acknowledgment:
• Iowa State University: Hailin Su, postdoc
• Theta Solutions, LLC: Bruce Golden
• IGS: Lauren Hyde, Steve McGuire, Wade Shafer

Questions?