SNPs and the Human Genome Prof. Sorin Istrail. A SNP is a position in a genome at which two or more...


Page 1:

SNPs and the Human Genome

Prof. Sorin Istrail

Page 2:

A SNP is a position in a genome at which two or more different bases occur in the population, each with a frequency >1%.

GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG

Single Nucleotide Polymorphism (SNP)

• The most abundant type of polymorphism

The two alleles at the site are G and T
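To make the example concrete, here is a minimal sketch that compares the two aligned sequences from this slide and reports where they differ. The function name `differing_sites` is an illustrative choice, not something from the lecture. Note that a difference between two sequences only flags a candidate site; by the definition above, it counts as a SNP only if each allele has a population frequency above 1%.

```python
def differing_sites(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# The two sequences shown on the slide.
seq_1 = "GATTTAGATCGCGATAGAG"
seq_2 = "GATTTAGATCTCGATAGAG"

sites = differing_sites(seq_1, seq_2)
print(sites)  # [(11, 'G', 'T')]
```

The single mismatch at position 11 carries the two alleles G and T named on the slide.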

Page 3:

tttctccatttgtcgtgacacctttgttgacaccttcatttctgcattctcaattctatttcactggtctatggcagagaacacaaaatatggccagtggcctaaatccagcctactaccttttttttttttttgtaacattttactaacatagccattcccatgtgtttccatgtgtctgggctgcttttgcactctaatggcagagttaagaaattgtagcagagaccacaatgcctcaaatatttactctacagccctttataaaaacagtgtgccaactcctgatttatgaacttatcattatgtcaataccatactgtctttattactgtagttttataagtcatgacatcagataatgtaaatcctccaactttgtttttaatcaaaagtgttttggccatcctagatatactttgtattgccacataaatttgaagatcagcctgtcagtgtctacaaaatagcatgctaggattttgatagggattgtgtagaatctatagattaattagaggagaatgactatcttgacaatactgctgcccctctgtattcgtgggggattggttccacaacaacacccaccccccactcggcaacccctgaaacccccacatcccccagcttttttcccctgctaccaaaatccatggatgctcaagtccatataaaatgccatactatttgcatataacctctgcaatcctcccctatagtttagatcatctctagattacttataatactaataaaatctaaatgctatgtaaatagttgctatactgtgttgagggttttttgttttgttttgttttatttgtttgtttgtttgtattttaagagatggtgtcttgctttgttgcccaggctggagtgcagtggtgagatcatagcttactgcagcctcaaactcctggactcaaacagtcctcccacctcagcctcccaaagtgctgggatacaggtgtgacccactgtgcccagttattattttttatttgtattattttactgttgtattatttttaattattttttctgaatattttccatctatagttggttgaatcatggatgtggaacaggcaaatatggagggctaactgtattgcatcttccagttcatgagtatgcagtctctctgtttatttaaagttttagtttttctcaaccatgtttacttttcagtatacaagactttgacgttttttgttaaatgtatttgtaagtattttattatttgtgatgttatttaaaaagaaattgttgactgggcacagtggctcacgcctgtaatcccagcactttgggaggctgaggcgggcagatcacgaggtcaggagatcaagaccatcctggctaacatggtaaaaccccgtctctactaaaaatagaaaaaaattagccaggcgtggtggcgagtgcctgtagtcccagctactcgggaggctgaggcaggagaatggtgtgaacctgggaggcggagcttgcagtgagctgagatcgtgccactgcattccagcctgcgtgacagagcgagactctgtcaaaaaaataaataaaatttaaaaaaagaagaagaaattattttcttaatttcattttcaggttttttatttatttctactatatggatacatgattgatttttgtatattgatcatgtatcctgcaaactagctaacatagtttattatttctctttttttgtggattttaaaggattttctacatagataaataaacacacataaacagttttacttctttcttttcaacctagactggatgcattttttgtttttgtttgtttgtttgctttttaacttgctgcagtgactagagaatgtattgaagaatatattgttgaacaaaagcagtgagagtggacatccctgctttccccctgattttagggggaatgttttcagtctttcactatttaatatgattttagctataggtttatcctagatccctgttatcatgttgag
gaaattcccttctatttctagtttgttgagattttttaattcatgtgattgcgctatctggctttgctctca

[Figure residue: SNP sites in the sequence above were annotated with two-letter allele labels (tc, ga, gc).]

The human genome contains ~3 Gb (3 billion base pairs) per haploid copy; each cell carries two copies, arranged in 46 chromosomes (23 pairs).

Two individuals are 99.9% the same, i.e., they differ in ~3 Mb (3 million base pairs).

SNPs occur once every ~600 bp.

The average gene in the human genome spans ~27 kb.

~50 SNPs per gene
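The figures on this slide follow from simple arithmetic, sketched below. The constant names are illustrative; the values are the approximate ones quoted on the slide.

```python
GENOME_BP = 3_000_000_000   # ~3 G base pairs
IDENTITY = 0.999            # two individuals are ~99.9% the same
SNP_SPACING_BP = 600        # one SNP every ~600 bp
GENE_SPAN_BP = 27_000       # average gene spans ~27 kb

# Base pairs at which two individuals are expected to differ.
differing_bp = round(GENOME_BP * (1 - IDENTITY))

# Expected number of SNPs falling inside an average gene.
snps_per_gene = GENE_SPAN_BP / SNP_SPACING_BP

print(differing_bp)    # 3000000
print(snps_per_gene)   # 45.0
```

The second result, 45, is what the slide rounds to "~50 SNPs per gene".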

Page 4:

Two individuals

G C T C G A C A A C A G
G T T C G T C A A C A G

Haplotypes (the alleles at the three SNP sites)

C A G
T T G
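A haplotype here is just the string of alleles read off at the SNP sites. The sketch below extracts it from the two sequences on this slide. The site positions are an assumption: the transcript does not preserve which columns were marked "SNP", so positions 2, 6, and 12 (1-based) are inferred from the haplotypes C A G and T T G, and `extract_haplotype` is an illustrative name.

```python
def extract_haplotype(sequence, snp_positions):
    """Read the alleles at the given 0-based SNP positions, in order."""
    return "".join(sequence[pos] for pos in snp_positions)


individual_1 = "GCTCGACAACAG"
individual_2 = "GTTCGTCAACAG"

# Assumed SNP sites (0-based indices 1, 5, 11), inferred from the slide's haplotypes.
snp_positions = [1, 5, 11]

haplotype_1 = extract_haplotype(individual_1, snp_positions)
haplotype_2 = extract_haplotype(individual_2, snp_positions)
print(haplotype_1, haplotype_2)  # CAG TTG
```

The two individuals carry the same allele (G) at the third site; it is still a SNP site if other members of the population carry a different base there.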

Page 5:

Mutations

Infinite Sites Assumption:

Each site mutates at most once
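One standard consequence of this assumption, not stated on the slide, is the four-gamete condition: if each site mutates at most once and there is no recombination, no pair of sites can show all four combinations 00, 01, 10, and 11 across the haplotypes. The sketch below checks that condition. The function name is illustrative, and the 0/1 rows are the ones from the Haplotype Pattern slide.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return the pairs of site indices at which all four gametes occur."""
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele pairs seen at sites i and j.
        gametes = {(hap[i], hap[j]) for hap in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Binary haplotypes taken from the Haplotype Pattern slide.
pattern = ["0000", "1101", "0010", "0101"]
print(four_gamete_violations(pattern))  # []

# A hypothetical set that cannot arise under infinite sites without recombination.
print(four_gamete_violations(["00", "01", "10", "11"]))  # [(0, 1)]
```

An empty result means the data are compatible with each site having mutated at most once on a single tree; a violation points to either a repeated mutation or a recombination event.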

Page 6:

Haplotype Pattern

0 0 0 0
1 1 0 1
0 0 1 0
0 1 0 1

C A G T
T T G A
C A T G
C T G T

At each SNP site, label the two alleles as 0 and 1.

The choice of which allele is 0 and which is 1 is arbitrary.
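The relabeling can be sketched as follows. Because the choice of labels is arbitrary, this sketch adopts one convention as an assumption: the first allele seen at a site becomes 0 and the other becomes 1. The function name `encode_binary` is illustrative. Only the first three columns of the letter matrix are used, since the fourth column as transcribed shows three distinct letters and so cannot be encoded with two labels.

```python
def encode_binary(haplotypes):
    """Relabel the alleles at each site as '0' (first seen) and '1' (the other)."""
    n_sites = len(haplotypes[0])
    labels = []
    for site in range(n_sites):
        mapping = {}
        for hap in haplotypes:
            allele = hap[site]
            if allele not in mapping:
                if len(mapping) == 2:
                    raise ValueError(f"site {site} has more than two alleles")
                mapping[allele] = str(len(mapping))
        labels.append(mapping)
    return [
        "".join(labels[site][hap[site]] for site in range(n_sites))
        for hap in haplotypes
    ]


# First three SNP columns of the letter matrix on the slide.
letters = ["CAG", "TTG", "CAT", "CTG"]
encoded = encode_binary(letters)
print(encoded)  # ['000', '110', '001', '010']
```

The output matches the first three columns of the 0/1 matrix on the slide.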

Page 7:

G T T C G A C T A T T A

G T T C G A C A A C A T
A C G T A T C T A T T A

Recombination

Page 8:

G T T C G A C T A T T A

G T T C G A C A A C A T
A C G T A T C T A T T A

The two alleles are linked, i.e., they are "traveling together".

Recombination disrupts the linkage.

Recombination
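The three sequences on these slides fit a single crossover: the first sequence is a prefix of one of the paired sequences joined to a suffix of the other. A minimal sketch, with illustrative function names, models that and recovers the breakpoints consistent with the slide's data.

```python
def recombine(left_parent, right_parent, breakpoint):
    """Single crossover: the prefix of one parent joined to the suffix of the other."""
    if len(left_parent) != len(right_parent):
        raise ValueError("parents must have equal length")
    return left_parent[:breakpoint] + right_parent[breakpoint:]


def consistent_breakpoints(left_parent, right_parent, child):
    """List every breakpoint at which a single crossover yields the child."""
    return [
        k for k in range(1, len(child))
        if recombine(left_parent, right_parent, k) == child
    ]


parent_1 = "GTTCGACAACAT"
parent_2 = "ACGTATCTATTA"
child = "GTTCGACTATTA"

breakpoints = consistent_breakpoints(parent_1, parent_2, child)
print(breakpoints)  # [6, 7]
```

Two breakpoints are reported because both parents carry C at the seventh position, so a crossover just before or just after it produces the same recombinant.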

Page 9:

Variations in Chromosomes Within a Population

[Figure: Emergence of Variations Over Time. Chromosomes descend from a Common Ancestor along a time axis ending at the present; a Disease Mutation is marked.]

Linkage Disequilibrium (LD)

Page 10:

Extent of Linkage Disequilibrium

[Figure: a Disease-Causing Mutation, with time points labeled 2,000 gens. ago, 1,000 gens. ago, and present.]
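To give a rough quantitative feel for why the extent of LD shrinks over generations, here is a sketch of the textbook decay relation D_t = D_0 (1 - r)^t, which holds under random mating for two sites separated by recombination fraction r. This formula does not appear on the slides, and the value r = 0.001 per generation is a hypothetical choice for illustration only.

```python
def ld_after(d_initial, recomb_fraction, generations):
    """Expected LD coefficient D after a number of generations of random mating."""
    return d_initial * (1.0 - recomb_fraction) ** generations


# Hypothetical recombination fraction between the mutation and a nearby marker.
r = 0.001

for generations in (1_000, 2_000):
    remaining = ld_after(1.0, r, generations)
    print(f"{generations} generations: {remaining:.3f} of the initial LD remains")
```

Under these assumptions roughly 37% of the initial LD survives 1,000 generations and roughly 14% survives 2,000. Markers with a larger r lose their association faster, which is why only a narrow region around an old mutation stays in strong LD.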