Post on 21-Jan-2016
• 1 Sources of DNA and Sequencing Methods
• 2 Genome Assembly Strategy and Characterization
• 3 Gene Prediction and Annotation
• 4 Genome Structure
• 5 Genome Evolution
• 6 A Genome-Wide Examination of Sequence Variations
• 7 An Overview of the Predicted Protein-Coding Genes in the Human Genome
• 8 Conclusions
2.91 billion bp (Celera); 3.2 billion bp (Hcon)
14.8 billion bp were sequenced in 9 months
5.1-fold coverage, 8-fold in genes
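The 5.1-fold figure follows from the two numbers on this slide: total bases sequenced divided by the length of the consensus sequence. A minimal sketch of that arithmetic is below; the function name `fold_coverage` is only illustrative and does not come from the slides.

```python
# Figures quoted on the slide (Celera assembly).
TOTAL_SEQUENCED_BP = 14.8e9    # raw bases sequenced in 9 months
CONSENSUS_GENOME_BP = 2.91e9   # length of the consensus sequence


def fold_coverage(sequenced_bp: float, genome_bp: float) -> float:
    """Average number of times each genome position was sequenced."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return sequenced_bp / genome_bp


coverage = fold_coverage(TOTAL_SEQUENCED_BP, CONSENSUS_GENOME_BP)
print(f"{coverage:.1f}-fold coverage")  # prints "5.1-fold coverage"
```

This is an average over the whole genome, so individual regions can be covered more or less deeply than the headline value.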
26,588 (sure) genes for proteins + 460 genes for RNAs + ~12,000 sequences related to mouse, etc.
1.1% exons, 24% introns, 75% intergenic
About 50% repeat sequences; 45% transposable elements
Average length of a gene: 27,894 bases
Exon: ~100 bp; most exons in one transcript: 234, in titin mRNA
Intron: ~100-30,000 bp
2.1 million SNPs, less than 1% of them affect proteins
Differences between human genomes: 1 per 1250 bp
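To put the variation figures in absolute terms, the rates above can be applied to the 2.91 billion bp consensus length. The sketch below is illustrative arithmetic only; it assumes the 1-per-1250 rate applies uniformly across the genome, and the variable names are not from the slides.

```python
# Figures quoted on the slides.
GENOME_BP = 2_910_000_000    # Celera consensus length
BP_PER_DIFFERENCE = 1250     # one difference per 1250 bp between two genomes
TOTAL_SNPS = 2_100_000       # SNPs catalogued

# Expected differences between a pair of genomes, assuming a uniform rate.
pairwise_differences = GENOME_BP // BP_PER_DIFFERENCE

# "Less than 1%" of SNPs affecting proteins gives an upper bound.
protein_snp_upper_bound = TOTAL_SNPS // 100

print(f"{pairwise_differences:,} expected differences per genome pair")
print(f"fewer than {protein_snp_upper_bound:,} SNPs affecting proteins")
```

Under that assumption, two genomes differ at roughly 2.3 million positions, and fewer than 21,000 of the catalogued SNPs affect proteins.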
The Human Genome (Feb. 2001)
Unknown 36%
hydrolase
isomerase
Kinase 4%
Nucleic acid enzyme 13%