CS177 Lecture 10 SNPs and Human Genetic Variation Tom Madej 11.21.05.

CS177 Lecture 10 SNPs and Human Genetic Variation

Tom Madej 11.21.05

Lecture overview

• Human genetic variation, HapMap project.

• Experimental methods: PCR, X-ray crystallography, microarrays.

Motivations to study human genetic variation

• The evolution of our species and its history.

• Understand the genetics of diseases, especially the more common complex ones such as diabetes, cancer, cardiovascular disease, and neurodegenerative disorders.

• To allow pharmaceutical treatments to be tailored to individuals (e.g. to avoid adverse reactions that depend on a patient's genetics).

Genetic variation

• The human genome has approximately 10 million polymorphisms, i.e. genetic variants that occur at the level of about 1% or more in the population.

• Many of these polymorphisms are SNPs, single nucleotide polymorphisms.

• These polymorphisms contribute to our individuality, and also influence our susceptibility to various diseases.
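As a rough illustration of the 1% frequency criterion above, the following sketch computes the minor allele frequency at a single site from a list of diploid genotypes and checks it against a threshold. The function names, the two-letter genotype encoding, and the sample data are assumptions made for this example, not part of the lecture.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for diploid genotypes written as e.g. 'AG'."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def minor_allele_frequency(genotypes):
    """Frequency of the second most common allele (0.0 if only one allele is seen)."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0


def is_polymorphism(genotypes, threshold=0.01):
    """Apply the 'about 1% or more' criterion; the threshold is adjustable."""
    return minor_allele_frequency(genotypes) >= threshold


# Hypothetical sample: 90 A/A individuals and 10 A/G individuals.
sample = ["AA"] * 90 + ["AG"] * 10
print(minor_allele_frequency(sample))
print(is_polymorphism(sample))
```

In this made-up sample the G allele accounts for 10 of 200 chromosomes, so the site would count as a common polymorphism under either the 1% or the 5% cutoff.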

Mendelian and non-Mendelian diseases

• Geneticists have been very successful in discovering the variants responsible for Mendelian disorders, which are characterized by following the Mendelian rules of inheritance.

• The study of particular families using linkage analysis has been successful for the Mendelian diseases.

• However, the more common complex (i.e. non-Mendelian) disorders have been much more difficult to investigate, even though there are clearly genetic components to many of these diseases.

Sources of genetic variation (during meiosis)

• Chromosomal reassortment; a human has 23 pairs of chromosomes, one of each pair is inherited from the father, and the other one from the mother.

• Mutation; errors in DNA copying. These may produce SNPs, or larger portions of DNA may be duplicated or copied incorrectly.

• Genetic recombination; shuffling of segments between partner chromosomes of a pair.
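To make the first and third sources concrete, here is a minimal sketch. It counts the chromosome combinations possible from independent reassortment of 23 pairs, and it models a single crossover between two partner chromosomes as a swap of tails. Representing a chromosome as a string of markers, and the function names, are simplifying assumptions for illustration only.

```python
def count_reassortments(n_pairs=23):
    """Each pair contributes one of two chromosomes, so there are 2**n_pairs combinations."""
    return 2 ** n_pairs


def crossover(chrom_a, chrom_b, point):
    """Single crossover: exchange everything from index `point` onward.

    Returns the two recombinant chromosomes.
    """
    return chrom_a[:point] + chrom_b[point:], chrom_b[:point] + chrom_a[point:]


print(count_reassortments())          # 8388608
print(crossover("AAAA", "BBBB", 1))   # ('ABBB', 'BAAA')
```

Reassortment alone already allows over eight million combinations per gamete; recombination then shuffles segments within each chromosome on top of that.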

Reassortment of genetic material during meiosis

Molecular Biology of the Cell, Alberts et al. Garland Publishing 2002 (Fig. 20-8)

Single Nucleotide Polymorphisms (SNPs)

• Major source of genetic variation.

• An estimated 7 million SNPs occur with a frequency of at least 5% in the human population; approx. 11 million occur with a frequency of at least 1%.

• Can we determine the associations between these variants and diseases?

Other types of genetic variations…

International HapMap project

• Haplotype: a set of variants on a chromosome that tend to be inherited as a block.

• Provide a collection of SNPs spanning the genome, and serving as genetic markers.

• Study correlations (linkage disequilibrium, LD) between the SNPs.

• Provide a guide for whole genome association studies.

HapMap project

• Project was launched in Oct 2002.

• In the first phase, the project genotyped 1.1 million SNPs in 269 individuals from four ethnic origins.

• Second phase will genotype another 4.6 million SNPs.

• Goal was to find most SNPs that occur with frequencies of at least 5% in the human population.

Statistics digression: here is an example of a commonly used correlation measure…
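Which measure the lecture showed is not stated here. As an assumption, the sketch below uses r², a widely used measure of correlation (LD) between two biallelic SNPs, computed from phased haplotypes. Alleles are coded 0/1, and the function name and input format are hypothetical.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, one per chromosome, where a and b
    are 0 or 1 for the allele carried at SNP A and SNP B respectively.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)),  with  D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either SNP is monomorphic; this sketch returns 0.
        return 0.0
    return d * d / denom


# Two SNPs whose alleles always travel together: perfectly correlated.
print(ld_r_squared([(0, 0)] * 5 + [(1, 1)] * 5))        # 1.0
# All four allele combinations equally common: no correlation.
print(ld_r_squared([(0, 0), (0, 1), (1, 0), (1, 1)]))   # 0.0
```

An r² near 1 means that knowing the allele at one SNP almost determines the allele at the other, which is what makes tag SNPs possible.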

LD and recombination hotspots

Correlated (LD) SNPs and tag SNPs

Nature Genetics: published online Oct 30, 2005; doi:10.1038/ng1688

Haplotype diversity

Nature, v. 437 Oct 27, 2005, p.1306

LD summary

• The human genome consists of regions of low polymorphism (i.e. low sequence variation) of sizes from 10-100 kb, interspersed with regions of high polymorphism.

• This seems to be due to “recombination hotspots” in the chromosomes.

• The inheritance of chromosomal regions without recombination (haplotypes) means that certain combinations of genes are widespread across the human population.

http://www.hapmap.org/

Exercise!

• Go to www.hapmap.org, and select “Browse Project Data” (link on the left).

• In the “Landmark or Region” box enter: DTNBP1, then click “Search”.

• Select the NM_032122 link (isoform a).

• Take a look at the Overview and Details.

• Go down to “Tracks”, select “Analysis All on”, and then “Update Image”.

• Take a look at the LD map, phased haplotypes, and list of tag SNPs.

Whole genome association study

• Given a sample of people, some with and some without a certain trait/phenotype (e.g. a certain disease).

• Call the two sets cases and controls.

• Investigate the genetic factors shared by the cases, but absent from the controls; i.e. find the associations between the genetic factors and the disease.

• The most straightforward way: genotype all the individuals.

• But this is far too expensive with current technology!
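For a single SNP, the case/control comparison described above can be sketched as a chi-square test on a 2x2 table of allele counts. This is one simple, standard test chosen for illustration; the lecture does not say which statistic to use, and the function name and counts below are hypothetical.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Each argument is (count of allele 1, count of allele 2) in that group.
    Returns (chi-square statistic, p-value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: allele 1 is more common among cases than controls.
chi2, p = allelic_chi_square((60, 40), (40, 60))
print(chi2, p)
```

In a genome-wide study this test would be repeated for every genotyped SNP, so the p-value threshold has to be made much stricter to account for the number of tests.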

The HapMap data is useful for whole genome association studies…

• The collection of SNPs gives us common genetic markers.

• By using tag SNPs we can reduce the number of SNPs that need to be genotyped in the study.

• It is even possible to produce SNP chips with a few hundred thousand tag SNPs that can be used for the genotyping.

• But statistical studies need to be done!
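The idea of reducing the genotyping load with tag SNPs can be sketched as a greedy selection: repeatedly pick the SNP that is strongly correlated with the most SNPs not yet covered. This is a simple heuristic for illustration, not necessarily the method used by HapMap; the r² threshold of 0.8, the SNP names, and the pairwise values are all assumptions.

```python
def select_tag_snps(snps, r2, threshold=0.8):
    """Greedy tag SNP selection.

    `snps` is a list of SNP identifiers; `r2` maps unordered pairs, stored as
    (snp_x, snp_y) tuples, to their pairwise r^2. Missing pairs count as 0.
    Returns a list of tags such that every SNP is a tag or has
    r^2 >= threshold with at least one tag.
    """
    def linked(x, y):
        if x == y:
            return True
        return r2.get((x, y), r2.get((y, x), 0.0)) >= threshold

    uncovered = set(snps)
    tags = []
    while uncovered:
        # Choose the SNP that covers the largest number of still-uncovered SNPs.
        best = max(snps, key=lambda s: sum(linked(s, u) for u in uncovered))
        tags.append(best)
        uncovered -= {u for u in uncovered if linked(best, u)}
    return tags


# Hypothetical region: snp1, snp2, snp3 are in strong LD; snp4 stands alone.
snps = ["snp1", "snp2", "snp3", "snp4"]
r2 = {("snp1", "snp2"): 0.95, ("snp2", "snp3"): 0.90, ("snp3", "snp4"): 0.10}
print(select_tag_snps(snps, r2))  # ['snp2', 'snp4']
```

Here two tag SNPs stand in for four, because genotyping snp2 is enough to predict snp1 and snp3 at the chosen threshold.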
