SNPs and the Human Genome
Prof. Sorin Istrail
A SNP is a position in a genome at which two or more different bases occur in the population, each with a frequency >1%.
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
Single Nucleotide Polymorphism (SNP)
• The most abundant type of polymorphism
The two alleles at the site are G and T
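As a minimal sketch, the site where the two example sequences differ can be located by comparing them position by position. The function name `differing_sites` is hypothetical, and the sequences are assumed to be already aligned. Two sequences alone only show a difference; calling it a SNP still requires the population frequency criterion (>1%) from the definition.

```python
def differing_sites(seq_a, seq_b):
    """Return (index, base_a, base_b) for every position where two aligned sequences differ.

    Indices are 0-based. Assumes the sequences are already aligned.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# The two example sequences from the slide.
seq_1 = "GATTTAGATCGCGATAGAG"
seq_2 = "GATTTAGATCTCGATAGAG"

print(differing_sites(seq_1, seq_2))  # [(10, 'G', 'T')]
```

The single differing position carries G in one sequence and T in the other, matching the two alleles named on the slide.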
tttctccatttgtcgtgacacctttgttgacaccttcatttctgcattctcaattctatttcactggtctatggcagagaacacaaaatatggccagtggcctaaatccagcctactaccttttttttttttttgtaacattttactaacatagccattcccatgtgtttccatgtgtctgggctgcttttgcactctaatggcagagttaagaaattgtagcagagaccacaatgcctcaaatatttactctacagccctttataaaaacagtgtgccaactcctgatttatgaacttatcattatgtcaataccatactgtctttattactgtagttttataagtcatgacatcagataatgtaaatcctccaactttgtttttaatcaaaagtgttttggccatcctagatatactttgtattgccacataaatttgaagatcagcctgtcagtgtctacaaaatagcatgctaggattttgatagggattgtgtagaatctatagattaattagaggagaatgactatcttgacaatactgctgcccctctgtattcgtgggggattggttccacaacaacacccaccccccactcggcaacccctgaaacccccacatcccccagcttttttcccctgctaccaaaatccatggatgctcaagtccatataaaatgccatactatttgcatataacctctgcaatcctcccctatagtttagatcatctctagattacttataatactaataaaatctaaatgctatgtaaatagttgctatactgtgttgagggttttttgttttgttttgttttatttgtttgtttgtttgtattttaagagatggtgtcttgctttgttgcccaggctggagtgcagtggtgagatcatagcttactgcagcctcaaactcctggactcaaacagtcctcccacctcagcctcccaaagtgctgggatacaggtgtgacccactgtgcccagttattattttttatttgtattattttactgttgtattatttttaattattttttctgaatattttccatctatagttggttgaatcatggatgtggaacaggcaaatatggagggctaactgtattgcatcttccagttcatgagtatgcagtctctctgtttatttaaagttttagtttttctcaaccatgtttacttttcagtatacaagactttgacgttttttgttaaatgtatttgtaagtattttattatttgtgatgttatttaaaaagaaattgttgactgggcacagtggctcacgcctgtaatcccagcactttgggaggctgaggcgggcagatcacgaggtcaggagatcaagaccatcctggctaacatggtaaaaccccgtctctactaaaaatagaaaaaaattagccaggcgtggtggcgagtgcctgtagtcccagctactcgggaggctgaggcaggagaatggtgtgaacctgggaggcggagcttgcagtgagctgagatcgtgccactgcattccagcctgcgtgacagagcgagactctgtcaaaaaaataaataaaatttaaaaaaagaagaagaaattattttcttaatttcattttcaggttttttatttatttctactatatggatacatgattgatttttgtatattgatcatgtatcctgcaaactagctaacatagtttattatttctctttttttgtggattttaaaggattttctacatagataaataaacacacataaacagttttacttctttcttttcaacctagactggatgcattttttgtttttgtttgtttgtttgctttttaacttgctgcagtgactagagaatgtattgaagaatatattgttgaacaaaagcagtgagagtggacatccctgctttccccctgattttagggggaatgttttcagtctttcactatttaatatgattttagctataggtttatcctagatccctgttatcatgttgag
gaaattcccttctatttctagtttgttgagattttttaattcatgtgattgcgctatctggctttgctctca
[Figure: genomic sequence annotated with two-letter SNP labels (tc, ga, gc).]
The human genome contains ~3 billion base pairs (Gb) per haploid set of 23 chromosomes; each diploid cell carries 46 chromosomes.
Two individuals are 99.9% the same, i.e., they differ in ~3 million base pairs.
SNPs occur once every ~600 bp.
The average gene in the human genome spans ~27 kb.
~50 SNPs per gene
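The figures above follow from simple arithmetic, sketched here with the slide's round numbers. The variable names are illustrative, and the genome-wide SNP count is derived from the stated spacing rather than quoted from the slides.

```python
# Round numbers taken from the slide.
genome_bp = 3_000_000_000   # ~3 billion base pairs
snp_spacing_bp = 600        # one SNP every ~600 bp
gene_span_bp = 27_000       # average gene spans ~27 kb

# 99.9% identity means 1 base in 1,000 differs between two individuals.
differing_bp = genome_bp // 1000

# SNPs expected in an average gene, and genome-wide, at the stated spacing.
snps_per_gene = gene_span_bp // snp_spacing_bp
snps_genome_wide = genome_bp // snp_spacing_bp

print(differing_bp)      # 3000000
print(snps_per_gene)     # 45
print(snps_genome_wide)  # 5000000
```

The computed 45 SNPs per gene is consistent with the slide's rounded figure of ~50.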
Two individuals
G C T C G A C A A C A G
G T T C G T C A A C A G
Haplotypes (the alleles at the three SNP sites): C A G and T T G
Mutations
Infinite Sites Assumption:
Each site mutates at most once
Haplotype Pattern
0 0 0 0
1 1 0 1
0 0 1 0
0 1 0 1

C A G T
T T G A
C A T G
C T G T
At each SNP site, label the two alleles as 0 and 1.
The choice of which allele is 0 and which is 1 is arbitrary.
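A minimal sketch of this labeling, assuming each haplotype is given as a string with one letter per SNP site. The function name `encode_haplotypes` and the three-site input are illustrative. Since the choice is arbitrary, this sketch labels the allele seen first at each site as 0.

```python
def encode_haplotypes(haplotypes):
    """Recode letter haplotypes as 0/1 rows, one column per SNP site.

    Convention (arbitrary): at each site, the first allele encountered is 0
    and the other allele is 1. A site with more than two alleles is rejected.
    """
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must cover the same sites")

    encoded = [[] for _ in haplotypes]
    for site in range(n_sites):
        alleles = []  # alleles at this site, in order of first appearance
        for h in haplotypes:
            if h[site] not in alleles:
                alleles.append(h[site])
        if len(alleles) > 2:
            raise ValueError(f"site {site} has more than two alleles: {alleles}")
        for row, h in zip(encoded, haplotypes):
            row.append(alleles.index(h[site]))
    return encoded


# Illustrative three-site haplotypes.
example = ["CAG", "TTG", "CAT", "CTG"]
for row in encode_haplotypes(example):
    print(row)
# [0, 0, 0]
# [1, 1, 0]
# [0, 0, 1]
# [0, 1, 0]
```

Swapping the 0 and 1 labels at any site flips that column but leaves the pattern of which haplotypes agree at that site unchanged.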
Recombination
Parental chromosomes:
G T T C G A C A A C A T
A C G T A T C T A T T A
Recombinant:
G T T C G A C T A T T A
The two alleles are linked, i.e., they are "traveling together".
Recombination disrupts the linkage.
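The recombination example can be sketched as a single crossover: the recombinant takes the left part of one parental chromosome and the right part of the other. The function name `recombine` is hypothetical, and the breakpoint after the seventh base is an assumption; it is one choice that reproduces the recombinant sequence shown on the slide.

```python
def recombine(parent_a, parent_b, breakpoint):
    """Single crossover: parent_a up to the breakpoint, parent_b from it onward."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parental chromosomes must have the same length")
    if not 0 <= breakpoint <= len(parent_a):
        raise ValueError("breakpoint must lie within the chromosome")
    return parent_a[:breakpoint] + parent_b[breakpoint:]


# The two parental sequences from the slide, written without spaces.
parent_a = "GTTCGACAACAT"
parent_b = "ACGTATCTATTA"

# Assumed breakpoint: after the seventh base.
child = recombine(parent_a, parent_b, 7)
print(child)  # GTTCGACTATTA
```

Alleles to the left of the breakpoint still travel together, as do alleles to its right, but alleles on opposite sides are no longer inherited as one block.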
Variations in Chromosomes Within a Population
[Figure: emergence of variations over time, from a common ancestor to the present, including a disease mutation.]
Linkage Disequilibrium (LD)
[Figure: extent of linkage disequilibrium around a disease-causing mutation 2,000 generations ago, 1,000 generations ago, and at present.]
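Why the extent of LD shrinks over the generations can be illustrated with a standard population-genetics result that is not stated on the slides: under random mating, the LD coefficient D between two sites decays by a factor of (1 - r) per generation, where r is the recombination fraction between them. The function name `ld_after` and the values of the starting D and r below are hypothetical.

```python
def ld_after(d0, r, generations):
    """LD coefficient after a number of generations: D_t = D_0 * (1 - r) ** t.

    d0: initial LD coefficient; r: recombination fraction per generation
    between the two sites (0 <= r <= 0.5). Assumes random mating.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    return d0 * (1.0 - r) ** generations


# Hypothetical values: a marker with r = 0.001 relative to the mutation.
for t in (0, 1000, 2000):
    print(t, ld_after(0.25, 0.001, t))
```

Sites close to the mutation (small r) keep most of their LD, while more distant sites lose it, so the region in LD with the mutation narrows as generations pass.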