Detection of positive selection in the human genome
Introduction
Before and after genome sequencing
Detection Methods
1.- High proportion of function-altering mutations
Sperm protamine P1: protamines are small, arginine-rich nuclear proteins that replace histones late in the haploid phase of spermatogenesis and are believed to be essential for sperm head condensation and DNA stabilization.
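One common way to quantify an excess of function-altering changes is a McDonald-Kreitman style comparison of non-synonymous and synonymous counts, fixed between species versus polymorphic within a species. The slide does not name this test, so the sketch below is an illustration only. The function name and the example counts are hypothetical.

```python
def mcdonald_kreitman(dn, ds, pn, ps):
    """Return (neutrality_index, alpha) from four site counts.

    dn, ds: non-synonymous / synonymous differences fixed between species
    pn, ps: non-synonymous / synonymous polymorphisms within a species
    """
    if min(dn, ds, pn, ps) < 0:
        raise ValueError("counts must be non-negative")
    if ds == 0 or ps == 0 or dn == 0:
        raise ValueError("ds, ps and dn must be non-zero for the ratios to be defined")
    # Neutrality index: ratio of the polymorphism ratio to the divergence ratio.
    neutrality_index = (pn / ps) / (dn / ds)
    # alpha: estimated fraction of non-synonymous substitutions driven by selection.
    alpha = 1.0 - neutrality_index
    return neutrality_index, alpha


# Hypothetical counts, chosen only to show the direction of the signal.
ni, alpha = mcdonald_kreitman(dn=7, ds=17, pn=2, ps=42)
print(f"NI = {ni:.3f}, alpha = {alpha:.3f}")
```

A neutrality index below 1 (alpha above 0) means more non-synonymous fixation than polymorphism alone would predict, which is the pattern this signature looks for.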
2.- Reduction in genetic diversity
Region with low diversity and excess of rare alleles
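Low diversity combined with an excess of rare alleles is often summarised with Tajima's D, which compares mean pairwise diversity with the number of segregating sites. The slide does not say which statistic was used, so this is a minimal sketch under that assumption. It takes equal-length haplotype strings with '0' and '1' alleles, and the function name is hypothetical.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length '0'/'1' haplotype strings.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes (variance is zero below that)")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for site in range(length):
        k = sum(1 for h in haplotypes if h[site] == "1")
        if 0 < k < n:
            seg_sites += 1
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return None

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: five sites, each derived allele carried by a single haplotype.
singletons = ["10000", "01000", "00100", "00010", "00001"] + ["00000"] * 5
print(tajimas_d(singletons))  # negative: excess of rare alleles
```

Negative values indicate more rare alleles than expected under neutrality with constant population size. Population expansion produces the same pattern, so the statistic alone does not prove selection.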
3.- High-frequency derived alleles
African populations: thought to be the result of selection for resistance to P. vivax malaria.
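To flag a high-frequency derived allele, each SNP is first polarised against an ancestral state (for example from an outgroup species), then the derived allele frequency is computed. The following is a minimal sketch assuming biallelic sites. The function names, the missing-data code "N" and the 0.5 threshold are illustrative assumptions, not values from the slides.

```python
def derived_allele_frequency(sample_alleles, ancestral_allele, missing="N"):
    """Fraction of called alleles that differ from the ancestral allele."""
    called = [a for a in sample_alleles if a != missing]
    if not called:
        raise ValueError("no called alleles at this site")
    derived = sum(1 for a in called if a != ancestral_allele)
    return derived / len(called)


def is_high_frequency_derived(sample_alleles, ancestral_allele, threshold=0.5):
    """True when the derived allele frequency exceeds the (assumed) threshold."""
    return derived_allele_frequency(sample_alleles, ancestral_allele) > threshold


# Toy site: ancestral allele 'A', eight of ten chromosomes carry derived 'G'.
site = list("GGGGGGGGAA")
print(derived_allele_frequency(site, "A"), is_high_frequency_derived(site, "A"))
```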
4.- Differences between populations
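Differences between populations are commonly measured with FST. As an illustration, the sketch below computes a simple two-population FST from the frequencies of one allele, assuming equal sample sizes and using expected heterozygosities. The slides do not state which estimator was used, and the function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic site in two equally weighted populations."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled population
    if h_total == 0.0:
        return 0.0  # site is monomorphic overall: no differentiation to measure
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


print(fst_two_populations(0.9, 0.1))  # strongly differentiated site
print(fst_two_populations(0.5, 0.5))  # identical frequencies
```

Values near 1 mark alleles that are common in one population and rare in the other, the pattern expected when selection acted in only one of them.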
5.- Long haplotype
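The long-haplotype signal is usually quantified with extended haplotype homozygosity (EHH): the probability that two chromosomes carrying the same core allele are identical over the whole stretch from the core to a given marker. This minimal sketch assumes phased haplotypes given as equal-length strings and extends only to the right of the core. The function name is hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele):
    """EHH at each marker from the core SNP rightwards, for carriers of core_allele."""
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    total_pairs = comb(n, 2)
    values = []
    for end in range(core_index, len(carriers[0])):
        # Group carriers by their extended haplotype from the core to this marker.
        groups = Counter(h[core_index:end + 1] for h in carriers)
        identical_pairs = sum(comb(count, 2) for count in groups.values())
        values.append(identical_pairs / total_pairs)
    return values


# Toy data: four carriers of core allele '1' at position 0, one non-carrier.
haps = ["1000", "1000", "1000", "1011", "0000"]
print(ehh(haps, core_index=0, core_allele="1"))  # [1.0, 1.0, 0.5, 0.5]
```

An allele that rose in frequency quickly keeps EHH close to 1 over long distances, because recombination has had little time to break up the haplotype.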
Results
Candidate region characteristics:
Mean length: 815 kb. Max length: 3.5 Mb
Often contain multiple genes. Mean: 4. Max: 15
A typical region harbours 400-4000 common SNPs (frequency > 5%): ¾ in the SNP database, ½ genotyped in HapMap2
Which are the true signatures of positive selection?
• They performed a similar analysis on all 22 candidate regions.
– 9166 SNPs associated with the long-haplotype signal (Long haplotype)
– 480 satisfied the two other criteria (Population differences and Derived allele)
– 41 (0.2% of all SNPs genotyped in the regions) possibly functional on the basis of a newly compiled database
• 41 SNPs:
– 8 encode non-synonymous changes:
  • SLC24A5 (well known)
  • EDAR
  • PCDH15
  • ADAT1
  • KARS
  • HERC1
  • SLC30A9
  • BLZF1
– The remaining 33 potentially functional SNPs lie within:
  • Conserved transcription factor motifs
  • Introns
  • UTRs
  • Other non-coding regions
Results
• SLC24A5:
– 600 kb region
– 914 genotyped SNPs
– Filter application:
  • 867 SNPs associated with the long-haplotype signal
  • 233 of 867 are high-frequency derived alleles
  • 12 of which are highly differentiated between populations
  • 5 of which are common in Europe and rare in Asia and Africa
  • 1 of these 5 is the only one implicated as functional by current knowledge
– Strongest signal of positive selection
– Encodes the A111T polymorphism associated with pigment differences in humans.
• LCT:
– 2.4 Mb
– 24 SNPs fulfill the first two criteria
– Confers adult persistence of lactase.
– Identified as functional only after extensive study of the LCT gene.
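The progressive filters applied to the SLC24A5 region above (long haplotype, then high-frequency derived allele, then population differentiation, then functional annotation) can be sketched as a chain of list filters. Everything here is illustrative: the `Snp` record, its field names, the population labels and both thresholds are assumptions, and the differentiation step merges "highly differentiated" and "common in one population, rare in the others" into a single test.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    name: str
    on_long_haplotype: bool   # associated with the long-haplotype signal
    derived_freq: dict        # population label -> derived allele frequency
    functional: bool          # annotated as possibly functional


def filter_cascade(snps, selected_pop, high_freq=0.5, min_diff=0.5):
    """Apply the filters in order and return the surviving SNPs after each stage."""
    stage1 = [s for s in snps if s.on_long_haplotype]
    stage2 = [s for s in stage1 if s.derived_freq[selected_pop] >= high_freq]
    stage3 = [
        s for s in stage2
        if all(
            s.derived_freq[selected_pop] - freq >= min_diff
            for pop, freq in s.derived_freq.items()
            if pop != selected_pop
        )
    ]
    stage4 = [s for s in stage3 if s.functional]
    return [stage1, stage2, stage3, stage4]


# Toy records (not real data), one failing at each stage and one passing all.
toy = [
    Snp("a", True, {"EUR": 0.9, "ASN": 0.1, "AFR": 0.05}, True),
    Snp("b", True, {"EUR": 0.9, "ASN": 0.8, "AFR": 0.1}, True),
    Snp("c", False, {"EUR": 0.9, "ASN": 0.1, "AFR": 0.1}, True),
    Snp("d", True, {"EUR": 0.2, "ASN": 0.1, "AFR": 0.1}, True),
    Snp("e", True, {"EUR": 0.95, "ASN": 0.0, "AFR": 0.0}, False),
]
print([len(stage) for stage in filter_cascade(toy, "EUR")])  # [4, 3, 2, 1]
```

Returning every stage rather than only the final list makes it easy to report how many SNPs survive each filter, as the slide does for SLC24A5.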
Some specific cases
• PS on copy number
– Expression differences exist between populations; they can confer different fitness advantages and thus be positively selected.
– Therefore, positive selection can potentially act on copy number and on non-coding regions.
– AMY1: copy number is positively correlated with salivary amylase protein expression.
• Mean AMY1 copy number was higher in the high-starch population
• PS on Noncoding Genomic Regions
Red triangles: previous candidates for selection (81)
Gray diamonds: newly available genome-wide empirical data set.
Discussion
Why have many earlier results fared poorly in genome-wide studies?
Discussion
1.- False positives and negatives
2.- Ascertainment bias of data
3.- Demographic events
4.- Biased DNA repair