The Human Genome
Some interesting facts
Biological system overview
Genes need to be expressed at the right time in the right place: ~5k – 10k genes per tissue
Genes have variability, which causes differences in phenotype
Genes encode proteins, which may be processed or modified: ~100k – 500k proteins
Proteins and RNAs interact in pathways and networks: ~8 interactions per protein
The human genome
Genome size: 3200 Mbp
http://www.ensembl.org
24 distinct chromosomes (22 autosomes, X and Y) + mitochondrial genome
Sequencing the genome
In 1953 James Watson and Francis Crick discovered the structure of DNA - the code of instructions for all life on earth
50 years later the human genome was sequenced by hierarchical shotgun sequencing
Sequencing the genome
The human genome was sequenced by:
The International Human Genome Sequencing Consortium
Celera Genomics
Technique: hierarchical shotgun sequencing (Consortium); Celera used whole-genome shotgun sequencing
Draft sequences released in early 2001, but ~10% of euchromatin missing and 150,000 gaps!
After finishing, re-released in 2004 with 341 gaps and covering 99% of the euchromatic genome
Sequencing time period
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.
First human genome took ~5 years and cost ~$3 billion
Now a genome can be sequenced in a few weeks for ~$5,000
BUT: this does not include the cost and time for data analysis!
Size of the genome
There are 100 trillion (100,000,000,000,000) cells in your body.
There are three billion (3,000,000,000) base pairs in the DNA code within each cell.
The genome requires more than 3 gigabytes of computer storage space
Storing a full genome sequenced by NGS costs ~$100 per genome per year
http://www.pbs.org/wgbh/nova/genome/facts.html
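The 3 gigabyte figure can be checked with a back-of-the-envelope calculation. The sketch below assumes one byte per base for plain text and, for comparison, a 2-bit packed encoding of A, C, G and T; the function name is only illustrative and neither encoding is taken from the slides.

```python
GENOME_BASES = 3_000_000_000  # three billion base pairs, as stated above


def storage_bytes(n_bases: int, bits_per_base: int) -> float:
    """Bytes needed to store n_bases at the given number of bits per base."""
    return n_bases * bits_per_base / 8


# Assumption: plain text uses one 8-bit character per base.
text_gb = storage_bytes(GENOME_BASES, 8) / 1e9
# Assumption: four possible letters fit into 2 bits each.
packed_gb = storage_bytes(GENOME_BASES, 2) / 1e9

print(f"Plain text:   {text_gb:.2f} GB")
print(f"2-bit packed: {packed_gb:.2f} GB")
```

Plain text lands at about 3 GB, which matches the figure quoted above; raw sequencing output, with many overlapping reads and quality scores, would take considerably more.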
Interesting facts
If all the DNA in your body were put end to end, it would reach to the sun and back over 600 times (100 trillion times six feet / 92 million miles).
If unwound and tied together, the strands of DNA in one cell would stretch almost six feet but would be only 50 trillionths of an inch wide.
It would take a person typing 60 words per minute, eight hours a day, around 50 years to type the human genome.
If all three billion letters in the human genome were stacked one millimeter apart, they would reach a height 7,000 times the height of the Empire State Building.
http://www.pbs.org/wgbh/nova/genome/facts.html
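These figures can be roughly reproduced with simple arithmetic. The sketch below is only a sanity check: the five letters per word, typing every day of the year, and the 443 m building height (including the antenna) are assumptions, not values from the slides.

```python
FEET_PER_MILE = 5280

# DNA end to end: 100 trillion cells with ~6 feet of DNA each.
total_miles = 100e12 * 6 / FEET_PER_MILE
sun_round_trips = total_miles / (2 * 92e6)  # 92 million miles each way

# Typing: 60 words per minute, 8 hours a day.
# Assumption: 5 letters per word, typing 365 days a year.
letters_per_day = 60 * 5 * 60 * 8
typing_years = 3e9 / letters_per_day / 365

# Stacking: 3 billion letters spaced 1 mm apart.
# Assumption: Empire State Building height of 443 m, antenna included.
stack_height_m = 3e9 * 1e-3
buildings = stack_height_m / 443

print(f"Sun round trips: {sun_round_trips:.0f}")
print(f"Years of typing: {typing_years:.0f}")
print(f"Building heights: {buildings:.0f}")
```

Under these assumptions the results come out close to the quoted values: a little over 600 round trips, somewhat more than 50 years of typing, and just under 7,000 building heights.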
Some statistics
Only 1.5% of the genome is coding
Other non-protein-coding sequence is for other kinds of “genes” or “lost genes”
A proportion of our genome is not our own!
50% repeat regions, most of viral origin!
The single most common protein "recipe" is the one for making reverse transcriptase
99.9% of our sequences are identical between individuals
Number of human genes
First estimates of between 20,000 and 150,000 genes
Seems to be between 20,000 and 30,000 genes
Expansion of the number of different protein molecules due to: (a) alternative splicing (30 to 50% increase); (b) post-translational modifications (5- to 10-fold increase)
There could be about 1 million different protein molecules in the human body
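As a rough worked example, the multipliers above can be combined to see what range of distinct protein molecules they imply. This sketch assumes the two effects simply multiply, which is a simplification and not something the slides state.

```python
# Ranges quoted above.
genes_low, genes_high = 20_000, 30_000
splicing_low, splicing_high = 1.3, 1.5  # 30 to 50% increase
ptm_low, ptm_high = 5, 10               # 5- to 10-fold increase

# Assumption: splicing and post-translational modification multiply independently.
proteins_low = genes_low * splicing_low * ptm_low
proteins_high = genes_high * splicing_high * ptm_high

print(f"Low estimate:  {proteins_low:,.0f}")
print(f"High estimate: {proteins_high:,.0f}")
```

With these inputs the range is roughly 130,000 to 450,000, in line with the 100k – 500k proteins quoted in the overview; reaching 1 million would need multipliers at or beyond the upper end of those given here.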
Gene numbers
Gene counts shown on the slide (organism labels not preserved): 2,000-5,000; 24,000; 6,000; 19,000; 14,000; 22,000; 21,000
Latest genome build
Known protein-coding genes: 20,442
Novel protein-coding genes: 434
Pseudogenes: 15,007
RNA genes: 12,523
Gene exons: 649,964
Gene transcripts: 181,744
Protein coding genes
Many of the genes are alternatively spliced
Human genes have short exons (50 codons) and long introns (10 kb)
Average gene length is 3000 bp, max is 2.4 million bp
We know the function of less than half of all the genes
Comparative genomics
Comparing the human genome to others:
Organism     Genome size (Mbp)    No. of genes
Human        3000                 21,000
Mouse        2800                 22,000
Fruit fly    180                  14,000
Worm         97                   19,000
Yeast        12                   6,000
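One way to read this comparison is as gene density, that is, genes per Mbp of genome. The short sketch below computes it from the values listed above; the dictionary layout and variable names are only illustrative.

```python
# (genome size in Mbp, number of genes), values from the comparison above
genomes = {
    "Human": (3000, 21_000),
    "Mouse": (2800, 22_000),
    "Fruit fly": (180, 14_000),
    "Worm": (97, 19_000),
    "Yeast": (12, 6_000),
}

# Gene density = number of genes divided by genome size.
density = {name: genes / size for name, (size, genes) in genomes.items()}

for name, genes_per_mbp in sorted(density.items(), key=lambda item: item[1]):
    print(f"{name:<10} {genes_per_mbp:7.1f} genes per Mbp")
```

Human and mouse come out at roughly 7 to 8 genes per Mbp while yeast packs about 500, so genome size grows much faster than gene number across these organisms.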
Evolution of humans
Genes in common with other organisms
About 75% of human genes have non-human homologues, ~70% match mouse proteins
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.
Functional composition
International Human Genome Sequencing Consortium 2001. Nature 409, 860 – 921.
Humans have more multifunctional genes, and genes involved in cell-cell communication and signalling
Human genome resources
Ensembl
UCSC Genome Browser
OMIM - human genes and inherited disorders
dbSNP - single nucleotide polymorphisms
Genetic Map at NCBI
Etc.
http://www.ncbi.nlm.nih.gov