Animal Breeding & Genomics Centre
Illumina Agriculture Seminar Series, August 3, 2009, Madison WI
Martien Groenen
High throughput SNP discovery in the pig using the Illumina Genome Analyzer and characterization of the porcine HapMap
panel using the Illumina Porcine 60K iSelect TM Beadchip
Overview
SNP discovery and chip design
Statistics of the 60 K chip
HapMap data: MDS and clustering
HapMap data: haplotype diversity, haplotype sharing and selective sweeps
From SNP genotyping towards resequencing
History
Objective
- Design of a 60K Illumina BeadChip
Challenge
- Limited number of SNPs available
- Only half of the porcine genome sequence available
- Job had to be done between April and August 2008
Approach
- SNP identification by sequencing reduced representation libraries (RRL) on an Illumina Genome Analyzer
Overview sequencing approach
5 reduced representation libraries (RRL)
- Restriction digest of DNA
- Separate on acrylamide gels
- Isolate 150-200 bp fragments
Coverage: 5 x 6x = 30x in total
- 30x: highly reliable SNP identification
- Rough estimate of MAF within breeds
RRLs results

RRL (enzyme, fragment size)   Genome %   Total reads (million)   Reads after filtering (million)
HaeIII 160-200                1          67.1                    57.8
AluI 100-150                  2          88.6                    83.4
AluI 150-200                  3          145.9                   95.8
MspI 100-200                  1          69.1                    52.3
DraI 180-260                  2          100.3                   67.7
TOTAL                         9          481                     357 (= 12.8 billion bp)
Align to reference genome and identify SNPs using MAQ
[Figure: reads aligned between HaeIII sites, with example SNPs C/T and G/C]
SNP called if minor allele seen at least 3x
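The calling rule above can be illustrated with a minimal sketch. It is not MAQ and not the authors' pipeline: the function name `call_snp`, its input (the bases observed at one reference position across aligned reads) and its return value are assumptions made for illustration. Only the threshold of three minor-allele observations comes from the slide.

```python
from collections import Counter


def call_snp(bases, min_minor_count=3):
    """Return (major, minor, minor_count) if the position passes the rule, else None.

    bases: iterable of single-character base calls observed at one position.
    min_minor_count: minimum number of reads supporting the minor allele
    (3 on the slide).
    """
    # Count only unambiguous base calls; anything else (e.g. N) is ignored.
    counts = Counter(b.upper() for b in bases if b.upper() in ("A", "C", "G", "T"))
    ranked = counts.most_common(2)
    if len(ranked) < 2:
        return None  # only one allele observed: no SNP
    (major, _major_count), (minor, minor_count) = ranked
    if minor_count < min_minor_count:
        return None  # minor allele seen too rarely to be trusted
    return major, minor, minor_count
```

For example, `call_snp("CCCCCTTT")` reports a C/T SNP with three supporting reads, while `call_snp("CCCCCTT")` returns `None` because the minor allele was seen only twice.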
SNP distribution on Illumina GA reads
[Figure legend: All, Transitions, Transversions]
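The split into transitions and transversions follows the standard definition: a transition exchanges two purines (A/G) or two pyrimidines (C/T), and every other substitution is a transversion. A small sketch of that classification follows; the function name `classify_snp` is hypothetical and is not taken from the slides.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(allele_a, allele_b):
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    a, b = allele_a.upper(), allele_b.upper()
    if a not in BASES or b not in BASES or a == b:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {a, b}
    # Both alleles in the same chemical class means a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"
```

Under this definition a C/T SNP is a transition and a G/C SNP is a transversion.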
All porcine SNPs currently available

SNP source                    Number
HQ Solexa SNPs WU*            333,000
454 sequencing MARC           110,000
7K iSelect chip AU-RI         5,500
dbSNP                         17,700
INRA (Sanger seq)             61,700
Cambridge University          14,900
Total No SNPs                 543,000
Total unique SNPs             510,000

*Plus additional 58,000 SNPs where minor allele seen twice

Total number of SNPs submitted for the chip: 72,000
- 70 % mapped on build 7
- 7 % predicted
- 23 % unmapped
SNP distribution
[Figure: histogram of SNP distribution; one axis runs from 0 to 10,000, the other from 5,000 to >150,000; series: Build 7 and Build 9 (89 % of SNPs)]
Porcine 60K BeadChip: Excellent performance
- Final total number of SNPs on the chip: 62,163
- Number of SNPs with MAF > 0.05 in at least 1 breed: 59K
- Conversion rate overall: 94.8 % (Illumina GA SNPs: 96 %)
- Good correlation between allele count based on GA sequencing and genotyping
The porcine HapMap project
- 576 individuals typed by Illumina
- 991 individuals typed by WU
Phenotypic variation
Porcine HapMap panel

Breed                   Illumina   WU     Total
Landrace                79         33     112
Large White             129        28     157
Duroc                   82         16     98
Pietrain                95         17     112
Wild Boar               20         249    269
12 Chinese breeds       31         233    264
3 Synthetic             6          49     55
Hampshire               69         8      77
Berkshire               58         -      58
Museum samples          -          28     28
Other suiformes         6          41     47
Other European breeds   -          289    289
Tobasco                 1          -      1
TOTAL                   576        991    1567

Species labelled on the slide: Sus celebensis, Sus verrucosus, Sus barbatus, Pecari tajacu, Potamochoerus, Babyrousa, Sus cebifrons
Including 165 animals of the discovery panel
Animal sequenced
Use of the 60K chip: European vs Chinese breeds
- Biased towards common variants
- Biased towards SNPs in European breeds
SNP density of the 60K chip
- Sufficient for GWA and GWS in European breeds
- Too low for Chinese breeds
Large White Ningxiang
Amaral et al. (2008) Genetics 179: 569
Domestication of the pig
Greger Larson et al. (2005) Science 307:1618
- Gene flow
- Selective sweeps
- Haplotype diversity
- Haplotype sharing
- Extent of LD
mtDNA analysis
Multi dimensional scaling and clustering
MDS plot Axis1-2: Division Europe-Asia (PLINK)
[Figure: MDS plot; labelled groups: outgroup (other Suidae), Asian breeds, European/US breeds, Asian WB, European WB]
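The slide attributes the MDS plot to PLINK. The sketch below is not PLINK's implementation; it only illustrates the general idea of classical (metric) MDS on a pairwise genotype distance matrix. The 0/1/2 genotype coding, the simple allele-sharing distance, and the function names `ibs_distance` and `classical_mds` are assumptions for illustration. Eigenvectors are found by plain power iteration with deflation, which is adequate for a toy example but not for a 60K-SNP data set.

```python
import math


def ibs_distance(genotypes):
    """Pairwise distance: mean absolute difference in allele count, scaled to 0..1.

    genotypes: list of samples, each a list of 0/1/2 allele counts per SNP.
    """
    m = len(genotypes[0])
    return [[sum(abs(a - b) for a, b in zip(gi, gj)) / (2.0 * m)
             for gj in genotypes] for gi in genotypes]


def double_center(dist):
    """Turn squared distances into the centred inner-product matrix B."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]  # matrix is symmetric: column means are equal
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(b, iterations=500):
    """Dominant eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    value = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return value, v


def classical_mds(dist, k=2):
    """Return k MDS coordinates per sample."""
    b = double_center(dist)
    n = len(b)
    coords = [[] for _ in range(n)]
    for _ in range(k):
        value, v = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))  # negative eigenvalues contribute nothing
        for i in range(n):
            coords[i].append(v[i] * scale)
        # Deflate: remove the component just extracted before finding the next axis.
        b = [[b[i][j] - value * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords
```

With two toy groups of samples that differ at most SNPs, the first MDS axis separates the groups, which is the kind of Europe-Asia division the plot shows on axes 1 and 2.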
MDS plot Axis1-2: Division Europe-Asia (PLINK)
[Figure: MDS plot with wild boar (WB) and museum samples marked]
- Museum sample collected in Turkey, 1964
- Museum sample from Indonesia (~100 years old)
Outgroups: other Suidae
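[Figure: phylogenetic tree of Sus scrofa and other Suidae; node labels give approximate divergence times of ~1, ~2, ~4, ~10 and ~70 MY]
Taxa shown (numbers in parentheses as given on the slide):
- Sus scrofa
- Sus barbatus (11)
- Sus celebensis (2)
- Sus verrucosus (10)
- Sus cebifrons (1)
- Potamochoerus p. (1)
- Potamochoerus l. (1)
- Babyrousa (1)
- Phacochoerus a. (2)
- Pecari tajacu (1)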
SNPs shared with other suidae
[Figure: numbers of SNPs shared with other Suidae; values shown, not assignable to species from the extracted text: 765438, 578, --, 5837, 59167, 1167, 2306, 478, 496]

Species              Average heterozygosity
Sus scrofa           23 %
Sus barbatus         3 %
Sus celebensis       2 %
Sus cebifrons        0.8 %
Sus verrucosus       0.6 %
Phacochoerus         0.8 %
Potamochoerus l.     0.7 %
Potamochoerus p.     0.9 %
Babyrousa b.         0.9 %

Ancestral allele identified for > 95 % of the SNPs
Haplotype diversity, haplotype sharing and
selective sweeps
Haplotype sharing between pig populations
3567 SNPs on SSC7, nearly complete, 133 Mb
- Sharing is high between populations of the same breed, but not uniform across chromosomes (Duroc/Duroc, Pietrain/Pietrain)
- Sharing is intermediate between populations of several 'white' breeds (Large White/Landrace, Large White/Pietrain)
- Sharing is low between Chinese and European breeds, but local exceptions exist (Duroc/Meishan)
Haplotype diversity and linkage disequilibrium
[Figure: two panels, a 10 Mb region on SSC7 and a 3.5 Mb region on SSC14; rows for DU, LW, LR, PI and MS; colour scale from High to Low (1 haplotype)]
Identification of selective sweeps: Examples on SSC1-4
[Figure: panels for SSC1, SSC2, SSC3 and SSC4; tracks labelled LW27, LWxPI, HA01, DU24, DU22 and DU20]
- Low heterozygosity
- High frequency of derived allele
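The two sweep signals named on this slide, low heterozygosity and a high frequency of the derived allele, can be scanned for in fixed windows along a chromosome. The sketch below is only an illustration under stated assumptions: the window size, both thresholds, the use of expected heterozygosity 2p(1-p), and the function names `window_stats` and `candidate_sweeps` are hypothetical and do not come from the slides.

```python
def window_stats(positions, derived_freqs, window_size):
    """Summarise SNPs in non-overlapping windows.

    positions: SNP positions in bp on one chromosome.
    derived_freqs: derived-allele frequency (0..1) at each SNP, same order.
    Returns {window_index: (mean_heterozygosity, mean_derived_freq, n_snps)}.
    """
    sums = {}
    for pos, p in zip(positions, derived_freqs):
        window = pos // window_size
        het = 2.0 * p * (1.0 - p)  # expected heterozygosity at a biallelic SNP
        h, d, n = sums.get(window, (0.0, 0.0, 0))
        sums[window] = (h + het, d + p, n + 1)
    return {w: (h / n, d / n, n) for w, (h, d, n) in sums.items()}


def candidate_sweeps(stats, max_het, min_derived):
    """Windows with low heterozygosity and high derived-allele frequency."""
    return sorted(w for w, (het, derived, _n) in stats.items()
                  if het <= max_het and derived >= min_derived)
```

A call such as `candidate_sweeps(window_stats(pos, freqs, 1_000_000), 0.1, 0.9)` would list the 1 Mb windows that show both signals; in practice the thresholds would have to be chosen against the genome-wide distribution.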
From SNP typing to resequencing
(1) Site frequency spectrum analysis based on RRL sequence data
- RRLs cover 5-10 % of the genome
- Coverage differs between breeds
- 4 commercial white breeds + Wild Boar
- Data from pools
- Within 500 Kb windows estimate:
  – Watterson's estimator: θ = f(S, n)
  – Tajima's estimator: π = f(S, n, freqs.)
  – Tajima's D: D = (π - θ) / sd
  – Fst measures population differentiation
[Figure: reads from breed 1 and breed 2 aligned to the reference genome]
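The three site-frequency statistics listed above can be written out with the textbook formulas for a sample of n chromosomes and S segregating sites. This is a sketch under assumptions: it treats allele counts as if they came from n individually sampled chromosomes, whereas the slide states that the data come from pools, so the authors' actual estimators may differ. The function name `site_frequency_stats` and its input format are hypothetical. Fst is not covered here.

```python
import math


def site_frequency_stats(allele_counts, n):
    """Return (theta_w, pi, tajimas_d) for one window.

    allele_counts: for each segregating site, the count k (0 < k < n) of one allele.
    n: number of chromosomes sampled.
    """
    if n < 2:
        raise ValueError("need at least two chromosomes")
    if any(k <= 0 or k >= n for k in allele_counts):
        raise ValueError("every site must be segregating (0 < k < n)")
    s = len(allele_counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    # Watterson's estimator: depends only on S and n.
    theta_w = s / a1
    # Tajima's estimator: mean number of pairwise differences, uses the frequencies.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)
    # Constants for the standard deviation of (pi - theta_w), Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    # D is undefined when there is no variance (e.g. no segregating sites).
    d = (pi - theta_w) / math.sqrt(variance) if variance > 0 else float("nan")
    return theta_w, pi, d
```

An excess of rare alleles pulls π below θ and gives a negative D, while an excess of intermediate-frequency alleles gives a positive D.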
Reanalyse data per breed
SNP distributions
[Figure: two histograms of SNP counts by minor allele count (1 to 20 and 1 to 33), split into transitions and transversions]
Example: Site frequency spectrum of SSC8 derived from deep sequencing of pools
[Figure: variation of nucleotide diversity (θ̂W) along SSC8 in Pietrain and Wild Boar]
Selective sweep around the KIT locus in white breeds
Sequencing of Sus verrucosus (Java warty pig)
- 6x on Illumina GA
- 200, 500 and 3000 bp libraries
[Figure label: ~2 MY]
Acknowledgements (1)

Funding
- USDA – CSREES
- EU (Sabre, PigSNP), Food Quality and Safety
- Institute for Pig Genetics (IPG)

DNA samples
- China Agricultural University: Ning Li
- University of Sassari, Italy: Massimo Scandura
- Staff Institute, Japan: Naohiko Okumura
- INRA-Toulouse, France: Alain Duvro
- MARC, USA: Gary Rohrer
- Roslin Institute, UK: Alan Archibald
- EU PigBiodiv1
- Wageningen University, The Netherlands: Richard Crooijmans
- University of Illinois, USA: Larry Schook
- Aristotle University of Thessaloniki, Greece: Costas Triantaphyllidis
- Istituto Zootecnico per la Sardegna, Italy: Sara Casu
- Research Centre for Biology – LIPI, Indonesia: Gono Semiadi
- Hendrix Genetics (HG)
- Institute for Pig Genetics (IPG)
- Pig Improvement Company (PIC)
- PTP, Italy: Elisabetta Guiffra
- REPROGEN, Australia: Jaime Gongora
- Universidade de Lisboa, Portugal: Deodália Dias
- Universitat Autonoma Barcelona: Miguel Perez-Enciso
- University of Aarhus, Denmark: Christian Bendixen
- Durham University, UK: Greger Larson
- Iowa State University, USA: Max Rothschild
Acknowledgements (2)
- Wageningen University, The Netherlands: Richard Crooijmans, Marcos Ramos, Andreia Fonseca, Hinri Kerstens, Hendrik-Jan Megens
- University of Aarhus, Denmark: Christian Bendixen, Jakob Hedegaard
- USDA, ARS, MARC, USA: Gary Rohrer, Tim Smith, Dan Nonneman
- INRA, Toulouse, France: Denis Milan
- Roslin Institute, UK: Alan Archibald, Andy Law
- Durham University, UK: Greger Larson
- Purdue University, USA: Bill Muir
- University of Illinois, USA: Larry Schook, Jon Beever
- USDA, ARS, Beltsville, USA: Curt Van Tassel
- Iowa State University, USA: Max Rothschild, Zhiliang Hu, James Reecy
- Sanger Institute, UK: Carol Churcher, Richard Clark
- University of Missouri-Columbia, USA: Jeremy Taylor, Bob Schnabel
- Universitat Autonoma Barcelona: Miguel Perez-Enciso, Lucca Ferreti
- Illumina Inc., San Diego, USA: Mark Hansen, Marylinn Munson