Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington...
-
date post
19-Dec-2015 -
Category
Documents
-
view
226 -
download
7
Transcript of Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington...
![Page 1: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/1.jpg)
Overview of SNP Genotyping
Debbie Nickerson
Department of Genome SciencesUniversity of Washington
![Page 2: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/2.jpg)
SNP Genotyping - Overview
• Project Rationale
• Genotyping Strategies/Technical Leaps
• Data Management/Quality Control
![Page 3: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/3.jpg)
SNP Project Rationale
• Heritability
• Power - Number of Individuals
• Number of SNPs - Candidate Gene, Pathway, Genome
5-10 SNPs, 400 to 1,000, 10K, 500K
• DNA requirements
• Cost
![Page 4: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/4.jpg)
C Allele T AlleleProbe and Target
SNP Genotyping
Target G AC C
C
Hybridize Fail to hybridizeAllele-Specific Hybridization
CC C
Target G ADegrade Fail to degrade
Taqman
Matched Mis-Matched
EclipseDashMolecular BeaconAffymetrix
CTarget G A
C incorporated C Fails to incorporate
+ddCTP
Polymerase Extension
Fluorescence PolarizationSequenom
Target G ACC
C
Ligate Fail to ligateOligonucleotide Ligation SNPlex
ParalleleIllumina
![Page 5: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/5.jpg)
SNP Typing Formats
Microtiter Plates - Fluorescence
Size Analysis by Electrophoresis
Arrays - Custom or Universal
eg. Taqman - Good for a few markers - lots of samples - PCR prior to genotyping
eg. SNPlex - Intermediate Multiplexing reduces costs - Genotype directly on
genomic DNA - new paradigm for high throughput
eg. Illumina, ParAllele, Affymetrics - Highly multiplexed- 96, 1,500 SNPs and beyond (500K+)
Low
Medium
High
Scale
![Page 6: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/6.jpg)
Defining the scale of the genotyping project is key to selecting an approach:
5 to 10 SNPs in a candidate gene - Many approaches (expensive ~ 0.60 per SNP/genotype)
48 ( to 96) SNPs in a handful of candidate genes (~ 0.25 to 0.30 per SNP/genotype)
384 - 1,536 SNPs - cost reductions based on scale (~0.08 - 0.15 per SNP/genotype)
300,000 to 500,000 SNPs defined format(~0.002 per SNP/ genotype)
10,000-20,000 SNPs - defined and custom formats(~0.03 per SNP/genotype)
1000 individuals
$6,000
$~29,000
$57,600-122,880
$800,000
$>250,000
![Page 7: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/7.jpg)
Many Approaches to Genotype a Handful of SNPs
• Taqman
• Single base extension - Fluorescence Polarization Sequenom - Mass Spec
• Eclipse
• Dash
• Molecular Beacons
PCR region prior to SNP genotyping - Adds to cost - Many use modified primers - the more modified, the higher the cost
![Page 8: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/8.jpg)
Taqman
Genotyping with fluorescence-based homogenous assays (single-tube assay) = 1 SNP/ tube
![Page 9: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/9.jpg)
Genotype Calling - Cluster Analysis
SNP 1252 - C
SN
P 1
252
- T
![Page 10: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/10.jpg)
Genotyping by Mass Spectrometry - 24
![Page 11: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/11.jpg)
Technological Leap - No advance PCR
Universal PCR after preparing multiple regions for analysis -
Several based on primer specific on genomic DNA followed by PCR of the ligated products - different strategiesand different readouts.
SNPlex, Illumina, Parallele (Affymetrix)
Also, reduced representation - Affymetrix - cut with restriction enzyme, then ligate linkers and amplify from linkers and follow by chiphybridization to read out.
![Page 12: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/12.jpg)
SNPlex Assay - 48 SNPs
Locus Specific Sequence
Ligation Product Formed
Universal PCR Priming site
ZipCode1
Allele Specific Sequence
Universal PCR Priming site
A
GGenomic DNATarget
C
C G
P1. Ligation
2. Clean-up
ZipCode2
(Homozygote shown in this case)
A
P
![Page 13: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/13.jpg)
3. Multiplexed Universal PCRUniv. PCR Primer
Univ. PCR Primer
Biotin
(Streptavidin)
4. Capture double stranded DNA- microtiter plate
PCR & ZipChute Hybridization
5. Denature double stranded DNA6. Wash away one strand
•7. Zip Chute Hybridization
![Page 14: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/14.jpg)
9. Characterize on Capillary Sequencer
Detection
SNP 1
SNP 2
![Page 15: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/15.jpg)
SNPlex Readout
NZipchute1 Position 1
NNZipchute2 Position 2
NNNZipchute3 Position 3
N(n)ZipChuten Position n
n ~ 48/lane
~2000 lanes/day
~96,000 genotypes/day
C
A
T
T
![Page 16: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/16.jpg)
Locus 1 Specific Sequence
cTag1 sequenceTag1 sequence
SubstrateBead or Chip
Tag 1
Tag 2
Tag 3
Tag 4
Chip ArrayBead Array
Multiplexed Genotyping - Universal Tag Readouts
Locus 2 Specific Sequence
cTag2 sequenceTag2 sequence
SubstrateBead or Chip
C T A G
Multiplex ~96 - 20,000 SNPs
Not dependent on primary PCRIllumina ParAllele
Affymetrix
![Page 17: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/17.jpg)
Arrays - High Density Genotyping Thousands of SNPs and Beyond
• “Bead” Arrays - Illumina– Manufactured by self-assembly
– Beads identified by decoding
![Page 18: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/18.jpg)
Sentrix™ Platform
Sentrix™ 96 Multi-array Matrix matches standard microtiter plates (96 - 1536 SNPs/well)Up to ~140,000 assays per matrix
![Page 19: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/19.jpg)
Fluorescent Image of BeadArray
~ 3 micron diameter beads
~ 5 micron center-to-center
~50,000 features on ~1.5 mm diameter bundle
Currently: up to 1,536 SNPs genotyped per bundle - at least 30 beads per code - many internal replicates
![Page 20: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/20.jpg)
Universal forwardSequences (1, 2)
Allele specificSequence
Illumina Assay - 3 Primers per SNP
Genomic DNA template
SNP
(1-20 nt gap)
Illumicode ™Sequence tag
Universal reversesequence
Locus specificSequence
5’3’
5’3’G
A
C
T
![Page 21: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/21.jpg)
Allele-Specific Extension and Ligation
Genomic DNA [T/C]
AGAllele Specific
Extension &Ligation
Universal PCR Sequence 1
Universal PCR Sequence 2
Universal PCR Sequence 3’
illumiCode’ Address
Ligase
[T/A]
Polymerase
![Page 22: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/22.jpg)
A illumiCode #561Amplification Template
GoldenGate™ AssayAmplification
PCR with Common Primers
Cy3 Universal Primer 1
Cy5 Universal Primer 2
Universal Primer P3
![Page 23: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/23.jpg)
Hybridization to Universal IllumiCodeTM
/\/\/\
/
/\/\/\
/
/\/\/\
/
illumiCode #561
illumiCode #217
illumiCode #1024
/\/\/\
/
/\/\/\
/
A/A T/T C/T
![Page 24: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/24.jpg)
BeadArray Reader
• Confocal laser scanning system• Resolution, 0.8 micron• Two lasers 532, 635 nm
– Supports Cy3 & Cy5 imaging
• Sentrix Arrays (96 bundle) and Slides for 100k fixed
formats
![Page 25: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/25.jpg)
![Page 26: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/26.jpg)
![Page 27: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/27.jpg)
Process Controls
Mismatch
Contamination
First Hyb
Gender
High AT/GC
Second Hyb
Gap
![Page 28: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/28.jpg)
Illumina Readout for Sentrix Array> 1,000 SNPs Assayed on 96 Samples
![Page 29: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/29.jpg)
Locus 1 Specific Sequence
cTag1 sequenceTag1 sequence
SubstrateBead or Chip
Tag 1
Tag 2
Tag 3
Tag 4
Chip ArrayBead Array
Multiplexed Genotyping - Universal Tag Readouts
Locus 2 Specific Sequence
cTag2 sequenceTag2 sequence
SubstrateBead or Chip
C T A G
Multiplex ~96 - 20,000 SNPs
Not dependent on primary PCRIllumina ParAllele
Affymetrix
![Page 30: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/30.jpg)
Parallele - Defined and Custom Formats
- Intermediate Strategy
- Multiplex ~ 20,000 SNPs
- Affymetrix readout Universal Arrays
![Page 31: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/31.jpg)
Parallele Technology (MIP)
Molecular Inversion Probes (MIP)
![Page 32: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/32.jpg)
![Page 33: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/33.jpg)
![Page 34: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/34.jpg)
Affymetrix’s Chip
![Page 35: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/35.jpg)
![Page 36: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/36.jpg)
Whole Genome Association Strategies
Two Platforms Available Different Designs
- Affymetrix
- Illumina
![Page 37: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/37.jpg)
Affymetrix GeneChip Mapping 500K Array Set
![Page 38: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/38.jpg)
Affymetrix Assays
100,000 to 500,000 fixed formats
Whole genome strategy
Not all regions - unique
Cheap but costly and throughput an issue
![Page 39: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/39.jpg)
500K: Content Optimized SNP Selection
~2,200,000 SNPs~2,200,000 SNPsFrom Public & PerlegenFrom Public & Perlegen
~650K SNPs~650K SNPs
48 individualsCall rate, concordance
500K SNPs500K SNPs
400 samplesCall rate, accuracy
LD
• Initial Selection: 48 people– 2.2M SNPs– 25 million genotypes– 16 each Caucasian, African, Asian– All HapMap samples
• Maximize performance: Second selection over 400 people– 270 HapMap Samples– 130 diversity samples– Accuracy
• HW, Mendel error, reproducibility– Call rates
• Maximize information content: – Prioritize SNPs based on LD &
HapMap (Broad Institute)
![Page 40: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/40.jpg)
The Assay - Details
http://www.affymetrix.com/products/arrays/specific/100k.affx
Optimized for 250-2000bp
![Page 41: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/41.jpg)
40 probes are used per SNP
![Page 42: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/42.jpg)
High Throughput Chip Formats
![Page 43: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/43.jpg)
80% genome coverage of Mapping 500K
• 500K run on 270 HapMap samples
• Pairwise r2 analysis for common SNPs (MAF>0.05)
• Robust coverage across populations r2=0.8– CEPH, Asian ~66%– Yoruba ~45%
• 2 & 3 marker predictors (multimarker) further increase coverage
![Page 44: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/44.jpg)
Mapping 500K Set
• >500K SNP’s– 2 array set
• Performance– 93-98% call rate range (>95% average)– >99.5% concordance with HapMap
Genotypes, 99.9% reproducibility• SNP lists, annotation and genotype data
available without restriction at Affymetrix.com
![Page 45: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/45.jpg)
Illumina - Infinium I & II 10K - 300K
![Page 46: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/46.jpg)
Infinium II AssaySingle Base Extension
Two haptens/colors
A
WGA target
UBead
C
Whole genome amplified DNA
SNP1 SNP2 SNP3 - - - - - SNP
A-DNP C-Bio C-BioA-DNP
A-DNP C-Bio
SNP1 SNP2 SNP3 - - - - - SNP
Signal Green/Red
TT GG G Ta
b
c
Whole genome amplified DNA
SNP1 SNP2 SNP3 - - - - - SNP
A-DNP C-Bio C-BioA-DNP
A-DNP C-Bio
SNP1 SNP2 SNP3 - - - - - SNP
Signal Green/Red
TT GG G Ta
b
c
![Page 47: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/47.jpg)
HumanHap-1 Genotyping BeadChip Content
1. Examine HapMap Phase I SNPs with MAF > 0.05 in CEU
2. Bin SNPs in high LD with one another using ldSelect (Carlson, et al. 2004)
3. Select tag SNP with highest design score for each bin.
Maximize coverage of human variation by choosing tag SNPs to uniquely identify haplotypes.
Tag SNP selection process:
![Page 48: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/48.jpg)
317K Content Selection
• Tag SNPs– r2 ≥ 0.80 for bins containing SNPs within 10kb of genes
or in evolutionarily conserved regions (ECRs)– r2 ≥ 0.70 for bins containing SNPs outside of genes or
ECRs.
• Additional Content– ~8,000 nsSNPs– ~1,500 tag SNPs selected from high density SNP data in
the MHC region
• Total 317,503 loci
![Page 49: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/49.jpg)
HumanHap300 Genomic Coverage by Population
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
>0 >0.1 >0.2 >0.3 >0.4 >0.5 >0.6 >0.7 >0.8 >0.9
Max r2
Coverage of Phase I+II DataCEU, mean 0.87 median 0.97
CHB+JPT, mean 0.80 median 0.94
YRI, mean 0.57 median 0.55
![Page 50: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/50.jpg)
![Page 51: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/51.jpg)
HumanHap300 Data Quality127 samples
25 trios15 replicates
Parameter Percent
Call rate 99.93%
Reproducibility >99.99%
Mendelian Inconsistencies
0.035%
Concordance withHapMap Data
99.69%
![Page 52: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/52.jpg)
HumanHap500 Content Strategy
– Analysis of full HapMap data set (Phase I + II) using HumanHap300 SNP list
– Fill in regions of low LD requiring higher density of tag SNPs
– Content Strategy• r2 ≥ 0.80 for bins containing SNPs within 10kb of genes or in
evolutionarily conserved regions (ECRs) in CEU• r2 ≥ 0.70 for bins containing SNPs outside of genes or ECRs in CEU• r2 ≥ 0.80 for large bins (≥ 3 SNPs) in CHB+JPT population• r2 ≥ 0.70 for large bins (≥ 5 SNPs) in YRI population
![Page 53: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/53.jpg)
Preliminary HumanHap500 Genomic Coverage by Population
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
>0 >0.1 >0.2 >0.3 >0.4 >0.5 >0.6 >0.7 >0.8 >0.9
Max r2
Coverage of Phase I+II Loci
CEU, mean 0.95 median 1.0
CHB+JPT, mean 0.93 median 1.0
YRI, mean 0.75 median 0.88
![Page 54: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/54.jpg)
Data Quality Control
• Estimating Error Rates
• Hardy Weinberg Equilibrium
• Frequency Analysis
• Missing Data
![Page 55: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/55.jpg)
Measuring Error Rates
• Genotype replicate samples
• Error rates generally < <1%
• Error rates are SNP specific
Rep 1
CC CT TT
Rep 2 CC 24 1 0
CT 0 50 0
TT 0 0 25
![Page 56: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/56.jpg)
Measuring Error Rates
• Genotype replicate samples
• Absolute number of replicates is more important than percentage– E.g. 10% of 200?
Rep 1
CC CT TT
Rep 2 CC 24 1 0
CT 0 50 0
TT 0 0 25
![Page 57: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/57.jpg)
Replicate samples
• Replicates can also detect sample handling errors– Wrong plate– Plate rotation
![Page 58: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/58.jpg)
Sample Handling Errors
• Sexing samples• Other known
genotypes– Blood type– HLA– Etc.
![Page 59: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/59.jpg)
Hardy Weinberg Equilibrium
• Given – p = Allele 1 frequency– q = 1-p
• Expectations– p2 = frequency 11– 2pq = frequency 12– q2 = frequency 22
![Page 60: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/60.jpg)
Hardy Weinberg Disequilibrium
• Heterozygote excess– Biologic
• Differential survival
– Technical• Nonspecific assays
Duplicated regions
• Homozygote excess– Biologic
• Population stratification• Null allele
– Technical• Allele dropout
![Page 61: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/61.jpg)
HWE Example
Small et al, NEJM 2002 v347 p1135
![Page 62: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/62.jpg)
Frequency Analysis
• Check haplotype and allele frequencies against standard populations
tagSNP Allele Frequency
0%
20%
40%
60%
80%
100%
0% 20% 40% 60% 80% 100%
PGA
IGAP
WhiteBlack
tagSNP Haplotype Frequency
0%
10%
20%
30%
40%
50%
0% 10% 20% 30% 40% 50%
PGA
IGAP
WhiteBlackEuropean
African
European
African
![Page 63: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/63.jpg)
Data Quality Control
• Estimate Error Rates from Replicates
• Check Hardy Weinberg Equilibrium
• Check Allele and Haplotype Frequencies
• Check Missing Data - Site specific
Sample specific
![Page 64: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/64.jpg)
Detection of Outliers of the Distribution
X-linked SNPX-linked SNP Unknown SNPUnknown SNP
Carlson et al, in preparation
![Page 65: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/65.jpg)
SNP Genotyping -
Is it an intermediate stop
on the way towhole-genome
sequencing?
![Page 66: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/66.jpg)
Long term sequencing - In situ approaches
Solexa - an example
![Page 67: Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu.](https://reader034.fdocuments.in/reader034/viewer/2022042615/56649d3f5503460f94a18407/html5/thumbnails/67.jpg)
SNP Genotyping Summary
1. Many different genotyping approaches are available - Low to high throughput
2. Some platforms permit users to pick custom SNPs but the highest throughputs are available only in fixed contents.
3. Not all custom SNPs will work for every format. Multiple formats will be required to carry out most projects targeting specific SNPs
4. There are still trade-offs for throughput - Samples vs. SNPs
5. Costs still dictate study design.
6. Regardless of the study - Design,quality control and tracking will rule the day! Laboratory Information Management Systems are key in every study design
(Key: Track - Samples, - Assays
- Completion rate - Reproducibility/Error Analysis)