The 3 genomic paradoxes
1
The 3 genomic paradoxes
[Figure labels: K, N, C]
2
K-value paradox: Complexity does not correlate with chromosome number.
Homo sapiens: 46
Lysandra atlantica: 250
Ophioglossum reticulatum: ~1260
3
C-value paradox: Complexity does not correlate with genome size.
Homo sapiens: 3.4 × 10^9 bp
Amoeba dubia: 6.8 × 10^11 bp
Allium cepa: 1.5 × 10^10 bp
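The size of the gap between these genomes is easier to see as a fold difference. The following is a minimal sketch that divides each genome size quoted on this slide by the human value; the dictionary and function names are illustrative and not part of the original slides.

```python
# Genome sizes in base pairs, as quoted on the slide.
GENOME_SIZE_BP = {
    "Homo sapiens": 3.4e9,
    "Allium cepa": 1.5e10,
    "Amoeba dubia": 6.8e11,
}


def fold_relative_to(reference, sizes):
    """Return each genome size divided by the reference species' size."""
    ref = sizes[reference]
    return {species: size / ref for species, size in sizes.items()}


folds = fold_relative_to("Homo sapiens", GENOME_SIZE_BP)

# Print the species from smallest to largest genome.
for species, fold in sorted(folds.items(), key=lambda kv: kv[1]):
    print(f"{species}: {fold:.1f}x the human genome")
```

With these figures the onion genome is roughly 4 times, and the Amoeba dubia genome roughly 200 times, the size of the human genome.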
4
N-value paradox: Complexity does not correlate with gene number.
~21,000 genes   ~25,000 genes   ~60,000 genes
5
Possible solutions:
6
What is complexity?
7
Solution 1 to the N-value paradox:
Many protein-encoding genes produce more than one protein product (e.g., by alternative splicing or by RNA editing).
8
RNA editing
Alternative splicing
9
The combinatorial use of RNA editing and alternative splicing probably causes the human proteome to be 5-10 times larger than that of Drosophila or Caenorhabditis.
10
959 cells   1,031 cells
19,000 genes   13,600 genes   ~10^8 cells
11
Solution 2 to the N-value paradox:
We are counting the wrong things; we should count other genetic elements (e.g., small RNAs).
12
Solution 3 to the N-value paradox:
We should look at connectivity rather than at nodes.
13
L. Mendoza and E. R. Alvarez-Buylla. 1998. Dynamics of the genetic regulatory network for Arabidopsis thaliana flower morphogenesis. J. Theor. Biol. 193:307-319.
14
Solution 4 to the N-value paradox:
The numbers provided by the various genome annotations are wrong!
15
Comparison of three databases
Hogenesch JB, Ching KA, Batalov S, Su AI, Walker JR, Zhou Y, Kay SA, Schultz PG, & Cooke MP. 2001. A comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes. Cell 106:413-415.
16
Range of C-values in various eukaryotic taxa
_______________________________________________________________
Taxon               Genome size range (Kb)        Ratio (highest/lowest)
_______________________________________________________________
Eukaryotes          2,300 - 686,000,000           298,261
  Amoebae           35,300 - 686,000,000          19,433
  Fungi             8,800 - 1,470,000             167
  Animals           49,000 - 139,000,000          2,837
    Sponges         49,000 - 53,900               1
    Molluscs        421,000 - 5,290,000           13
    Crustaceans     686,000 - 22,100,000          32
    Insects         98,000 - 7,350,000            75
    Bony fishes     340,000 - 139,000,000         409
    Amphibians      931,000 - 84,300,000          91
    Reptiles        1,230,000 - 5,340,000         4
    Birds           1,670,000 - 2,250,000         1
    Mammals         1,700,000 - 6,700,000         4
  Plants            50,000 - 307,000,000          6,140
_______________________________________________________________
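The ratio column in this table is simply the largest genome size divided by the smallest, rounded to the nearest integer. The following sketch recomputes it from the ranges given in the table; the variable names are illustrative, and the rounding rule is an assumption inferred from the printed values.

```python
# (lowest, highest) genome size in Kb for each taxon, copied from the table.
C_VALUE_RANGE_KB = {
    "Eukaryotes": (2_300, 686_000_000),
    "Amoebae": (35_300, 686_000_000),
    "Fungi": (8_800, 1_470_000),
    "Animals": (49_000, 139_000_000),
    "Sponges": (49_000, 53_900),
    "Molluscs": (421_000, 5_290_000),
    "Crustaceans": (686_000, 22_100_000),
    "Insects": (98_000, 7_350_000),
    "Bony fishes": (340_000, 139_000_000),
    "Amphibians": (931_000, 84_300_000),
    "Reptiles": (1_230_000, 5_340_000),
    "Birds": (1_670_000, 2_250_000),
    "Mammals": (1_700_000, 6_700_000),
    "Plants": (50_000, 307_000_000),
}

# Ratio of highest to lowest C-value, rounded to the nearest integer
# (assumed to be the rounding used on the slide).
ratios = {
    taxon: round(highest / lowest)
    for taxon, (lowest, highest) in C_VALUE_RANGE_KB.items()
}

for taxon, ratio in ratios.items():
    print(f"{taxon:<12} {ratio:>8,}")
```

Computed this way, the values agree with the ratio column of the table, from 1 for birds and sponges up to 298,261 for eukaryotes as a whole.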
17
If the variation in C-values is attributed to genes, it can be due to interspecific differences in
(1) the number of protein-coding genes
(2) the size of proteins
(3) the size of protein-coding genes
(4) the number and sizes of genes other than protein-coding ones.
18
The number of protein-coding genes in eukaryotes is thought to vary over a 50-fold range. This variation is insufficient to explain the 300,000-fold variation in nuclear-DNA content.
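A back-of-the-envelope calculation shows how large the shortfall is. The sketch below assumes, purely for illustration, that genome size scaled in direct proportion to the number of protein-coding genes; that proportionality is not a claim made on the slides.

```python
# Figures quoted on the slide.
GENE_NUMBER_FOLD = 50      # range of protein-coding gene number
C_VALUE_FOLD = 300_000     # range of nuclear-DNA content

# Illustrative assumption: DNA content proportional to gene number.
# What remains is the variation that gene number cannot account for.
unexplained_fold = C_VALUE_FOLD / GENE_NUMBER_FOLD

print(f"Unexplained variation: {unexplained_fold:,.0f}-fold")
```

Even under that generous assumption, a 6,000-fold range of variation in nuclear-DNA content is left unexplained by gene number.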
19
20
The bigger the genome, the smaller the genic fraction
21
Nongenic DNA is the sole culprit for the C-value paradox!
99.998%
22
MECHANISMS FOR GLOBAL INCREASES IN GENOME SIZE
Genome increase:
(1) global increases, i.e., the entire genome or a major part of it is duplicated
(2) regional increases, i.e., a particular sequence is multiplied to generate repetitive DNA.
23
Polyploidization = the addition of one or more complete sets of chromosomes to the original set.
An organism with an odd number of autosomes cannot undergo meiosis or reproduce sexually.
Musa acuminata
24
allopolyploidy
25
Triticum urartu (AA) × Aegilops speltoides (BB) → T. turgidum (AABB)
T. turgidum (AABB) × T. tauschii (DD) → T. aestivum (AABBDD)
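The genome formulas in this wheat pedigree follow a simple rule: each parent contributes one copy of each of its genomes, and chromosome doubling then restores pairs. The following is a toy sketch of that bookkeeping; the function names `gamete` and `allopolyploid` are hypothetical, and the model ignores the actual cytological route (for example, unreduced gametes).

```python
def gamete(formula):
    """Reduce a genome formula to one copy per genome, e.g. 'AABB' -> 'AB'."""
    if len(formula) % 2:
        raise ValueError(f"expected paired genomes, got {formula!r}")
    pairs = [formula[i:i + 2] for i in range(0, len(formula), 2)]
    if any(pair[0] != pair[1] for pair in pairs):
        raise ValueError(f"genomes are not in matching pairs: {formula!r}")
    return "".join(pair[0] for pair in pairs)


def allopolyploid(parent_a, parent_b):
    """Combine one gamete from each parent, then double every genome."""
    hybrid = gamete(parent_a) + gamete(parent_b)
    return "".join(letter * 2 for letter in hybrid)


# Genome formulas as given on the slide.
t_turgidum = allopolyploid("AA", "BB")        # T. urartu x Ae. speltoides
t_aestivum = allopolyploid(t_turgidum, "DD")  # T. turgidum x T. tauschii

print(t_turgidum, t_aestivum)
```

Applying the rule twice reproduces the tetraploid AABB and hexaploid AABBDD formulas shown for T. turgidum and T. aestivum.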
26
autopolyploidy
27
Following polyploidization, a very rapid process of duplicate-gene loss ensues.
28
Allohexaploid Triticum aestivum originated about 10,000 years ago. In this very short time, many of its triplicated loci have been silenced.
The proportion of enzymes produced by triplicate, duplicate, and single loci is 57%, 25%, and 18%, respectively.
29
During evolution, autopolyploidy and allopolyploidy become cryptopolyploidy.
30
Genome sizes in 80 grass species (Poaceae).
31
32
33
It has been suggested that the emergence of vertebrates was made possible by two rounds of tetraploidization.
Two cryptooctoploids?
34
Does chromosome number increase due to polyploidy affect the phenotype?
Chrysanthemum species have 9 to 90 chromosomes in haploid cells.
35
54 duplicated regions.
36
2 possible explanations:
(1) the duplicated regions were formed independently by regional duplications occurring at different times.
(2) the duplicated regions have been produced simultaneously by a single tetraploidization event, followed by genome rearrangement and loss of many redundant duplicates.
37
50/54 duplicated regions have maintained the same orientation with respect to the centromere.
54 independent regional duplications are expected to result in ~7 triplicated regions (i.e., duplicates of duplicates), but none was observed.
38
Loss of 92% of the duplicate genes.
Occurrence of 70-100 map disruptions.
39
Arabidopsis thaliana: regional duplications
40
What about polysomy?
41
Polysomy is usually deleterious.
trisomy 21
42
An exception?
43
MAINTENANCE OF NONGENIC DNA: HYPOTHESES
(1) The selectionist hypothesis.
(2) The neutralist hypothesis (junk DNA).
(3) The intragenomic selectionist hypothesis (selfish DNA).
(4) The nucleotypic hypothesis.
44
45
46
47
48
[Figure: log nuclear volume (µm^3) plotted against log DNA per cell]
Correlation between nuclear volume and nuclear DNA content in apical meristem cells of 30 herbaceous species. Regression slope = 0.826 fitted by least squares.
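A slope below 1 on a log-log plot means that nuclear volume grows less than proportionally with DNA content. The sketch below turns the reported slope into fold changes, assuming both axes use the same logarithm base and that the fit has the form log V = a + 0.826 log DNA. The intercept is not given on the slide, so only ratios are computed, and the function name is illustrative.

```python
SLOPE = 0.826  # regression slope reported in the figure caption


def volume_fold_change(dna_fold_change, slope=SLOPE):
    """Predicted fold change in nuclear volume for a fold change in DNA.

    On a log-log line the intercept cancels when taking ratios:
    V2 / V1 = (DNA2 / DNA1) ** slope.
    """
    return dna_fold_change ** slope


for fold in (2, 10):
    predicted = volume_fold_change(fold)
    print(f"{fold}x DNA -> about {predicted:.2f}x nuclear volume")
```

Under these assumptions, doubling the DNA content predicts a bit less than a doubling of nuclear volume, and a tenfold increase in DNA predicts less than a tenfold increase in volume.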
49
50
MAINTENANCE OF NONGENIC DNA: EVIDENCE
(1) The selectionist hypothesis.
(2) The neutralist hypothesis (junk DNA).
(3) The intragenomic selectionist hypothesis (selfish DNA).
(4) The nucleotypic hypothesis.
51
Even whole chromosomes may be junk.
A person needs a Y, like a fish needs bicycles.
52
with apologies to Irina Dunn, Australian feminist (1970).
53
Nature (2004) 431:988-993.