National Center for Biotechnology Information
Giant Viruses and Domains of
(Cellular) Life: Two
fundamentally different
histories and modes of
evolution
Eugene V. Koonin
National Center for Biotechnology Information, NLM,
NIH
DNA Habitats and their RNA Inhabitants
July 3, 2014, Salzburg
Some viruses are comparable to cellular life forms in
size and genetic complexity
Mimivirus genome (~1.2 Mbp, ~1,000 genes) is twice as large
as that of Mycoplasma genitalium (580 kbp; ~500 genes)
The largest, most complex viruses:
NCLDV (Nucleo-Cytoplasmic Large DNA viruses of eukaryotes)
Raoult et al. Science. 2004 Nov 19;306(5700):1344;
Suzan-Monti et al. PLoS ONE. 2007 Mar 28;2(3)
Science 19 July 2013:
Vol. 341 no. 6143 pp. 281-286
Pandoraviruses: Amoeba Viruses
with Genomes Up to 2.5 Mb Reaching
That of Parasitic Eukaryotes
Nadège Philippe, Matthieu Legendre, Gabriel Doutre, Yohann Couté,
Olivier Poirot, Magali Lescot, Defne Arslan, Virginie Seltzer, Lionel Bertaux,
Christophe Bruley, Jérome Garin, Jean-Michel Claverie, Chantal Abergel
Legendre et al. Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology. PNAS 2014
…third type of giant virus combining an even larger pandoravirus-like particle 1.5 μm in length with a surprisingly
smaller 600 kb AT-rich genome, a gene content more similar to Iridoviruses and Marseillevirus, and a fully
cytoplasmic replication reminiscent of the Megaviridae
• Iyer LM, Aravind L, Koonin EV. Common origin of four diverse families of
large eukaryotic DNA viruses. J Virol. 2001 Dec;75(23):11720-34
Nucleo-Cytoplasmic Large DNA Viruses (NCLDV): recognized before the discovery of giant viruses
Giant viruses: incorporated later on
Iyer LM, Balaji S, Koonin EV, Aravind L. Evolutionary genomics of nucleo-cytoplasmic large
DNA viruses. Virus Res. 2006 Apr;117(1):156-84
Arch Virol. 2013 Jun 29.
"Megavirales", a proposed new order for eukaryotic
nucleocytoplasmic large DNA viruses. Colson P, De Lamballerie X, Yutin N, Asgari S, Bigot Y, Bideshi DK, Cheng XW,
Federici BA, Van Etten JL, Koonin EV, La Scola B, Raoult D.
The nucleocytoplasmic large DNA viruses (NCLDVs) comprise a
monophyletic group of viruses that infect animals and diverse unicellular
eukaryotes. The NCLDV group includes the families Poxviridae,
Asfarviridae, Iridoviridae, Ascoviridae, Phycodnaviridae, Mimiviridae and
the proposed family "Marseilleviridae". The family Mimiviridae includes the
largest known viruses, with genomes in excess of one megabase, whereas
the genome size in the other NCLDV families varies from 100 to 400
kilobase pairs. Most of the NCLDVs replicate in the cytoplasm of infected
cells, within so-called virus factories. The NCLDVs share a common
ancient origin, as demonstrated by evolutionary reconstructions that
trace approximately 50 genes encoding key proteins involved in viral
replication and virion formation to the last common ancestor of all
these viruses. Taken together, these characteristics lead us to propose
assigning an official taxonomic rank to the NCLDVs as the order
"Megavirales", in reference to the large size of the virions and genomes of
these viruses.
• Do giant viruses represent a 4th (5th etc) domain(s)
of life?
• Origin of the giant virus genes?
• Evolutionary relationships between giant viruses
and other Megavirales
• Origins of the Megavirales
Giant viruses: The 4th domain of
Life?
More precisely: highly derived
descendants of a 4th domain of
cellular life?
Phylogenies of “universal cellular
genes” encoded in NCLDV
genomes
Yutin, Wolf, Koonin, Virology 2014, in press
Phylogeny of the mimivirus: the 4th domain of life?
Raoult et al.
Science 2004
The finding of numerous virally encoded components of an incomplete translation apparatus
strongly suggested a process of reductive evolution from an even more complex ancestor that
was endowed with protein synthetic capability. Such an ancestor could either have evolved
from an obligate intracellular parasitic cell (functionally similar to Rickettsia or Chlamydia),
or be derived from the nucleus of a primitive eukaryote.
Claverie, Genome Biol. 2006, 7:110
The tree of our discontent
Colson P, de Lamballerie X, Fournous G, Raoult D.
Reclassification of giant viruses composing a fourth domain of life in the new order Megavirales.
Intervirology. 2012;55(5):321-32
Legendre M, Arslan D, Abergel C, Claverie JM.
Genomics of Megavirus and the elusive fourth domain of Life.
Commun Integr Biol. 2012 Jan 1;5(1):102-6
Williams TA, Embley TM, Heinz E.
Informational gene phylogenies do not support a fourth domain of life
for nucleocytoplasmic large DNA viruses. PLoS One. 2011;6(6):e21080
Colson P, Gimenez G, Boyer M, Fournous G, Raoult D.
The giant Cafeteria roenbergensis virus that infects a widespread marine phagocytic protist
is a new member of the fourth domain of Life. PLoS One. 2011 Apr 29;6(4):e18935
Nasir A, Kim KM, Caetano-Anolles G.
Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup
along with superkingdoms Archaea, Bacteria and Eukarya. BMC Evol Biol. 2012 Aug 24;12:156
Boyer M, Madoui MA, Gimenez G, La Scola B, Raoult D.
Phylogenetic and phyletic studies of informational genes in genomes highlight existence of
a 4 domain of life including giant viruses. PLoS One. 2010 Dec 2;5(12):e15530
The 4th domain lore
…and the solitary voice of dissent
Translation proteins in giant viruses
Phylogenomics protocol
• All proteins from the following groups were retrieved from GenBank: Pandoravirus, Mimiviridae, Phycodnaviridae, C. roenbergensis virus, P. globosa viruses, Organic Lake phycodnaviruses, Marseillevirus, and Lausannevirus.
• All the proteins were used as queries for rpsblast against COG and KOG profiles (/net/nabl000/vol/blast/db/blast/oasis_cog and /net/nabl000/vol/blast/db/blast/oasis_kog).
• Viral proteins with significant (e < 0.01) hits to COGs or KOGs assigned to Translation category (J) were collected and manually checked for false positives.
• Alignment: Muscle/low information content position filtering
• Tree construction: ML/TreeFinder
• Statistical tests of tree topology: AU, ELW
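The hit-selection step of the protocol above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the `Hit` record, the protein identifiers, the e-values and the `CATEGORY` map are all hypothetical, and keeping only the best hit per query is an assumption made here for brevity. Only the cutoff (e < 0.01) and the Translation category letter (J) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str     # viral protein identifier (hypothetical)
    profile: str   # COG or KOG profile identifier
    evalue: float  # e-value reported for the query/profile pair


# Illustrative profile -> functional category map; J = Translation.
# The non-J entry is assumed here only to show the filter rejecting it.
CATEGORY = {"COG0018": "J", "COG0162": "J", "KOG1670": "J", "COG0086": "K"}


def translation_candidates(hits, max_evalue=0.01):
    """Keep, per query, the lowest-e-value hit to a category J profile."""
    best = {}
    for hit in hits:
        if hit.evalue >= max_evalue:
            continue  # not significant under the e < 0.01 cutoff
        if CATEGORY.get(hit.profile) != "J":
            continue  # profile is not assigned to Translation
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return best


hits = [
    Hit("virA_001", "COG0018", 1e-40),
    Hit("virA_001", "COG0162", 5e-3),
    Hit("virA_002", "COG0086", 1e-80),  # significant, but not category J
    Hit("virB_007", "KOG1670", 0.5),    # category J, but fails the cutoff
]
candidates = translation_candidates(hits)
```

The surviving candidates would still need the manual false-positive check that the protocol calls for; nothing in this sketch replaces it.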
Falsification of the 4th domain hypothesis:
• Giant viruses within one of the 3 domains: incompatible with 4th domain hypothesis
• Giant viruses outside the 3 domains: compatible with 4th domain hypothesis
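The falsification criterion above can be expressed as a small check on a rooted tree. The sketch below is an illustration under stated assumptions, not the test used in the talk (which relies on AU and ELW topology tests): trees are nested tuples, leaf labels follow a hypothetical `Domain|name` convention, and a viral clade counts as "within" a domain when its sister group consists of only some of that domain's sampled members.

```python
def leaves(tree):
    """Flatten a nested-tuple tree into its list of leaf labels."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def domain(leaf):
    """Leaf labels are assumed to look like 'Domain|name'."""
    return leaf.split("|", 1)[0]


def classify(tree):
    """Return 'incompatible', 'compatible' or 'polyphyletic' for the viral leaves."""
    all_leaves = leaves(tree)
    viral = {leaf for leaf in all_leaves if domain(leaf) == "Virus"}
    if not viral:
        raise ValueError("tree contains no viral leaves")

    # Walk down to the smallest clade that still contains every viral leaf.
    node, parent = tree, None
    while not isinstance(node, str):
        holding = [child for child in node if viral <= set(leaves(child))]
        if not holding:
            break
        parent, node = node, holding[0]

    if set(leaves(node)) != viral:
        return "polyphyletic"  # viral leaves do not form a single clade
    if parent is None:
        raise ValueError("tree contains only viral leaves")

    sister = {leaf for child in parent if child is not node for leaf in leaves(child)}
    sister_domains = {domain(leaf) for leaf in sister}
    if len(sister_domains) == 1:
        (d,) = sister_domains
        all_of_d = {leaf for leaf in all_leaves if domain(leaf) == d}
        if sister < all_of_d:
            return "incompatible"  # nested inside domain d
    return "compatible"  # branches outside the sampled domains


nested = ((("Eukarya|amoeba", ("Virus|v1", "Virus|v2")), "Eukarya|yeast"),
          ("Archaea|a1", "Bacteria|b1"))
outside = (("Virus|v1", "Virus|v2"),
           (("Eukarya|e1", "Eukarya|e2"), ("Archaea|a1", "Bacteria|b1")))
split = ((("Virus|v1", "Eukarya|e1"), ("Virus|v2", "Eukarya|e2")), "Bacteria|b1")
```

When the viral sequences are polyphyletic, each viral group would have to be classified separately; the sketch only reports that case. The outcome also depends on how the tree is rooted.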
RNA polymerase subunits
[Figure: two phylogenetic trees (panels A and B, scale bars 0.2) with branch support values. Labeled branches in both panels: Eukaryotic RNAp II, Eukaryotic RNAp III, Archaea, Mimiviridae, Marseilleviridae, Iridoviridae, Entomopoxvirinae, Chordopoxvirinae, OLPG, African swine fever virus, Cafeteria roenbergensis virus, Pandoravirus salinus, Pandoravirus dulcis, Emiliania huxleyi virus 86, Pithovirus sibericum and Phytophthora parasitica. Panel B also shows Galdieria sulphuraria, Monosiga brevicollis and Oikopleura dioica.]
No 4th domain
Translation proteins in giant viruses
KOG/COG annotation (Y = present). Genome columns, in order: Acanthamoeba polyphaga mimivirus; Acanthamoeba castellanii mamavirus; Acanthamoeba polyphaga lentillevirus; Acanthamoeba polyphaga moumouvirus; Moumouvirus; Moumouvirus Monve; Moumouvirus goulette; Courdo11 virus; Terra1 virus; Megavirus chiliensis; Megavirus courdo7; Megavirus courdo11; Cafeteria roenbergensis virus BV-PW1; OLPV; OLPV1; OLPV2; Pglob16; Pglob14; Pglob12; Marseillevirus; Lausannevirus; Pandoravirus salinus; Pandoravirus dulcis; Acanthocystis turfacea Chlorella virus; Paramecium bursaria Chlorella virus.
Each row lists one Y per genome encoding the protein; the column positions of the Y marks are not preserved.
COG0018 Arginyl-tRNA synthetase Y Y Y Y Y Y Y Y Y Y
COG0017 Aspartyl/asparaginyl-tRNA synthetase Y Y Y Y Y
COG0215 Cysteinyl-tRNA synthetase Y Y Y Y Y Y Y Y Y Y
COG0060 Isoleucyl-tRNA synthetase Y Y Y Y Y Y
COG0143 Methionyl-tRNA synthetase Y Y Y Y Y Y Y Y Y
COG0180 Tryptophanyl-tRNA synthetase Y Y Y
COG0162 Tyrosyl-tRNA synthetase Y Y Y Y Y Y Y Y Y Y Y Y Y Y
COG0130 Pseudouridine synthase Y
COG1503 Peptide chain release factor 1 (eRF1) Y Y Y Y Y Y Y
COG0251 Putative translation initiation inhibitor, yjgF family Y Y Y
COG0009 Putative translation factor (SUA5) Y
KOG0062 translation elongation factor eEF3 Y Y Y Y Y
COG5256 Translation elongation factor EF-1alpha (GTPase) Y Y
COG0023 Translation initiation factor 1 (eIF-1/SUI1) Y Y Y Y Y Y Y Y Y Y
KOG3403 Translation initiation factor 1A (eIF-1A) Y
COG0532 Translation initiation factor 2 (IF-2; GTPase) Y
COG1093 Translation initiation factor 2, alpha subunit (eIF-2alpha) Y
COG1601 Translation initiation factor 2, beta subunit (eIF-2beta) Y Y
COG5257 Translation initiation factor 2, gamma subunit (eIF-2gamma) Y
KOG0122 Translation initiation factor 3, subunit g (eIF-3g) Y
KOG1670 Translation initiation factor 4F, cap-binding subunit (eIF-4E) Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
KOG0327 Translation initiation factor 4F, helicase subunit (eIF-4A) Y Y Y Y Y Y
Giant virus proteins involved in translation
Tyrosyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.5. Viral branches: Pandoravirus dulcis, Pandoravirus salinus, Mimiviridae. Other labeled branches: Acanthamoeba castellanii, Opisthokonts, Aplysia californica, Haloarcula marismortui, a Nanoarchaeota archaeon, Archaea, an uncultured bacterium, Entamoeba histolytica, Entamoeba nuttalli, Trichomonas vaginalis, Giardia lamblia, Ichthyophthirius multifiliis, Oxytricha trifallax, Guillardia theta, Trypanosoma cruzi, Naegleria gruberi, Viridiplantae, Ciliata.]
No 4th domain. Polyphyletic giant viruses
Arginyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branch: Mimiviridae. Other labeled branches: Monosiga brevicollis, Xenopus laevis, Hydra vulgaris, Giardia lamblia, Dictyostelium discoideum, Naegleria gruberi, Volvox carteri, Physcomitrella patens, Blastocystis hominis, Guillardia theta, Paramecium tetraurelia, Theileria annulata, Phytophthora infestans, Thalassiosira pseudonana, Parachlamydia acanthamoebae, Chlamydia trachomatis, four cyanobacteria (Pseudanabaena, Nostoc, Synechocystis, Synechococcus), and a collapsed Bacteria, Archaea group.]
No 4th domain
Aspartyl/asparaginyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branch: Mimiviridae. Other labeled branches: several collapsed Bacteria groups, Bacteriovorax marinus, a Flavobacteriaceae bacterium, Borrelia garinii, Eukaryotes (protista), Eukaryotes (Opisthokonts), Schizosaccharomyces pombe, Saccharomyces cerevisiae, Pyrobaculum aerophilum, Thermoplasma volcanium, Archaea. One branch is marked "mitochondrial".]
No 4th domain (complex history)
Cysteinyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branch: Mimiviridae. Other labeled branches: Blastocystis hominis, Acanthamoeba castellanii, Polysphondylium pallidum, Naegleria gruberi, Albugo laibachii, Thalassiosira pseudonana, Galdieria sulphuraria, Opisthokonts, Eukaryota, and a collapsed Bacteria, Archaea group.]
No 4th domain
Methionyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branch: Mimiviridae. Other labeled branches: Edhazardia aedis, Encephalitozoon cuniculi, Nosema ceranae, Enterocytozoon bieneusi, Naegleria gruberi, Amphimedon queenslandica, Monosiga brevicollis, Salpingoeca rosetta, Podospora anserina, Eukaryotes, and a collapsed Bacteria, Archaea group.]
No 4th domain
Tryptophanyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Megavirus chiliensis, Pandoravirus salinus. Other labeled branches: Capsaspora owczarzaki, Monosiga brevicollis, Phytophthora infestans, Dictyostelium discoideum, Perkinsus marinus, Guillardia theta (two sequences), Selaginella moellendorffii, Leishmania infantum, Leishmania major, Pseudocercospora fijiensis, Phaeodactylum tricornutum, Oxytricha trifallax, Plasmodium cynomolgi, Giardia lamblia, Trichomonas vaginalis, Chondrus crispus, Cyanidioschyzon merolae, Emiliania huxleyi, Entamoeba dispar, Entamoeba histolytica, Puccinia graminis, Hydra vulgaris, Acanthamoeba castellanii, and a collapsed Bacteria, Archaea group.]
No 4th domain
Isoleucyl-tRNA synthetase
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Cafeteria roenbergensis virus, Mimiviridae. Other labeled branches: Eukaryotes, Marinomonas sp. MWYL1, Chlamydophila pecorum, Plesiocystis pacifica, Fibrobacter succinogenes, a SAR324 cluster bacterium, an actinobacterium, Treponema pallidum, Bacteria, and a collapsed Bacteria, Archaea group.]
Compatible with 4th domain
Translation elongation factor EF-1alpha (GTPase)
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Cannes 8 virus, Marseillevirus, Lausannevirus (labeled Marseilleviridae). Other labeled branches: Dictyostelium discoideum, Dictyostelium fasciculatum, Polysphondylium pallidum, Entamoeba histolytica, Acanthamoeba castellanii (Amoebozoa label), Naegleria gruberi, Emiliania huxleyi, Guillardia theta, Ciliata, Viridiplantae, Stramenopiles, Opisthokonts, Rhodophyta, Eukaryotes (protista).]
No 4th domain
Translation initiation factor 1 (eIF-1/SUI1)
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Mimiviridae, Marseillevirus, Cafeteria roenbergensis virus. Other labeled branches: Trichomonas vaginalis, Paramecium tetraurelia, Entamoeba invadens, Dictyostelium fasciculatum, Dictyostelium discoideum, Trichoderma reesei, Naegleria gruberi, Acanthamoeba castellanii, Cryptosporidium parvum, Oxytricha trifallax, Perkinsus marinus, Ostreococcus tauri, Trypanosoma cruzi, Guillardia theta, Schizophyllum commune, Eukaryotes (protista), and a collapsed Bacteria, Archaea group.]
No 4th domain, polyphyletic Megavirales
Translation initiation factor 2, beta subunit (eIF-2beta) / eIF-5 N-terminal domain
[Figure: phylogenetic tree with branch support values; scale bar 0.5. Viral branches: Cafeteria roenbergensis virus (three sequences), Cannes 8 virus and Marseillevirus (labeled Marseillevirus family), Abalone herpesvirus. Other labeled branches: Giardia lamblia, Dictyostelium purpureum, Edhazardia aedis, Trichomonas vaginalis, Nosema ceranae, Spraguea lophii, Trachipleistophora hominis, Acanthamoeba castellanii, Guillardia theta, Angomonas deanei, Entamoeba invadens, Entamoeba histolytica, Selaginella moellendorffii, Nematocida parisii, Leishmania infantum, Chondrus crispus, Naegleria gruberi, Eukaryotes, Eukaryotes (protista), Methanopyrus kandleri, Archaea.]
No 4th domain, polyphyletic Megavirales
Translation initiation factor 4F, helicase subunit (eIF-4A), and related helicases
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Mimiviridae, Cafeteria roenbergensis virus, Diachasmimorpha longicaudata entomopoxvirus. Other labeled branches: Paramecium tetraurelia, Encephalitozoon cuniculi, Giardia lamblia, Entamoeba histolytica, Caenorhabditis elegans, three collapsed Eukaryotes groups, Bacteria.]
No 4th domain
Putative translation factor SUA5
[Figure: phylogenetic tree with branch support values. Viral branches: Pandoravirus salinus, Pandoravirus dulcis. Other labeled branches: several collapsed Bacteria (Firmicutes) groups, Exiguobacterium sp. MH3, Listeria ivanovii, Halobacillus sp. BAB-2008, Brevibacillus borstelensis, Alicyclobacillus acidocaldarius, Xylanimonas cellulosilytica, Acholeplasma brassicae, Acidiphilium sp. CAG:727, Capsaspora owczarzaki, Galdieria sulphuraria, Perkinsus marinus, Chondrus crispus, Guillardia theta, Encephalitozoon romaleae, Vittaforma corneae, Acanthamoeba castellanii, Eukaryotes (Opisthokonts), Archaea, Bacteria.]
No 4th domain
Peptide chain release factor 1 (eRF1)
[Figure: phylogenetic tree with branch support values; scale bar 0.2. Viral branches: Marseillevirus family, Mimiviridae. Other labeled branches: Giardia lamblia, Trypanosoma brucei, Trichomonas vaginalis, Euplotes octocarinatus, Tetmemena pustulata, Paramecium tetraurelia (two sequences), Tetrahymena thermophila, Entamoeba invadens, Meyerozyma guilliermondii, Cryptomonas paramecium, Hemiselmis andersenii, Guillardia theta, Acanthamoeba castellanii, Enterocytozoon bieneusi, Nematocida parisii, Archaea.]
Compatible with 4th domain
• Do giant viruses represent a 4th (5th etc) domain(s)
of life?
• Origin of the giant virus genes?
• Evolutionary relationships between giant viruses
and other Megavirales
• Origins of the Megavirales
Phylogenomics of giant viruses: where do all these genes come from?
[Figure: bar chart (axis 0 to 300) of inferred gene origins for Acanthamoeba polyphaga mimivirus, Megavirus chiliensis, Cafeteria roenbergensis virus BV-PW1, Organic Lake phycodnavirus 1, Phaeocystis globosa virus, Pandoravirus salinus and Pithovirus sibericum. Origin categories: Bacteria, NCLDV, other Viruses, Archaea, Amoebozoa, other Eukaryota.]
• Phylogenomics pipeline with origin assignment
• 1292 trees
• Distinct phylogenomic landscapes for different virus groups
• Substantial bacterial contribution
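The "phylogenomic landscape" of a genome can be thought of as a tally of per-gene origin calls. The sketch below shows only that tallying step, assuming each gene tree has already been reduced to one origin label; how the pipeline derives the label from a tree is not described on the slide. The genome names `virus_A` and `virus_B` and the counts are invented for illustration; only the six category names are taken from the slide.

```python
from collections import Counter, defaultdict

# Origin categories as named in the slide's legend.
CATEGORIES = ("Bacteria", "NCLDV", "other Viruses",
              "Archaea", "Amoebozoa", "other Eukaryota")


def origin_profile(assignments):
    """Tally (genome, origin) pairs into a per-genome Counter of origins."""
    profile = defaultdict(Counter)
    for genome, origin in assignments:
        if origin not in CATEGORIES:
            raise ValueError(f"unknown origin category: {origin!r}")
        profile[genome][origin] += 1
    return profile


def fraction(profile, genome, origin):
    """Share of a genome's assigned genes that fall into one category."""
    total = sum(profile[genome].values())
    return profile[genome][origin] / total if total else 0.0


# Invented example data: one origin call per gene tree.
assignments = [
    ("virus_A", "NCLDV"), ("virus_A", "Bacteria"), ("virus_A", "Bacteria"),
    ("virus_A", "Amoebozoa"),
    ("virus_B", "other Eukaryota"), ("virus_B", "NCLDV"),
]
profile = origin_profile(assignments)
```

Comparing `fraction(profile, genome, "Bacteria")` across genomes is one simple way to express a statement such as "substantial bacterial contribution" numerically.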
• Do giant viruses represent a 4th (5th etc) domain(s)
of life?
• Origin of the giant virus genes?
• Evolutionary relationships between giant viruses
and other Megavirales
• Origins of the Megavirales
Heliothis virescens ascovirus 3e
Spodoptera frugiperda ascovirus 1a
Trichoplusia ni ascovirus 2c
Invertebrate iridescent virus 6
Invertebrate iridescent virus 3
Wiseana iridescent virus
Infectious spleen and kidney necrosis virus
Lymphocystis disease virus – isolate China
Frog virus 3
Singapore grouper iridovirus
Pithovirus sibericum
Lausannevirus
Marseillevirus
Pandoravirus dulcis
Pandoravirus salinus
Emiliania huxleyi virus
Ectocarpus siliculosus virus
Feldmannia species virus
Acanthocystis turfacea Chlorella virus
Paramecium bursaria Chlorella virus NYs1
Bathycoccus sp. RCC1105 virus BpV1
Micromonas sp. RCC1109 virus MpV1
Micromonas pusilla virus SP1
Ostreococcus lucimarinus virus OlV1
Ostreococcus tauri virus
Organic Lake phycodnavirus 1
Organic Lake phycodnavirus 2
Phaeocystis globosa virus
Cafeteria roenbergensis virus BV-PW1
Acanthamoeba polyphaga mimivirus
Acanthamoeba polyphaga moumouvirus
Megavirus chiliensis
African swine fever virus
Amsacta moorei entomopoxvirus L
Mythimna separata entomopoxvirus L
Anomala cuprea entomopoxvirus
Melanoplus sanguinipes entomopoxvirus
Canarypox virus
Nile crocodilepox virus
Molluscum contagiosum virus subtype 1
Orf virus
Squirrelpox virus
Vaccinia virus
Myxoma virus
Yaba-like disease virus
[Figure: tree of the viruses listed above; scale bar 0.5. Family labels: Ascoviridae, Iridoviridae, Marseilleviridae, Phycodnaviridae, extended Mimiviridae, Poxviridae, Asfarviridae; the label "giant" marks giant-virus branches. Branches carry gain/loss annotations such as 92 +44 -20, 575 +545 -33 and 655 +511 -33.]
Phylogeny of 6 nearly universal genes of Megavirales
ML reconstruction of gene gain/loss
• Phylogenetic analysis of universal genes
combined with ML reconstruction of gene
gain-loss
• Independent origin of 3 groups of giant viruses
associated with massive gene gain in each
case
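The bookkeeping behind a gain/loss annotation can be illustrated with plain set arithmetic. This is not the ML reconstruction named on the slide: it assumes the gene-family content of an ancestor and a descendant is already known, and it assumes that an annotation such as "575 +545 -33" reads as a gene count followed by gains and losses on the branch. The family identifiers are hypothetical.

```python
def branch_events(ancestor, descendant):
    """Return (descendant size, number gained, number lost) for one branch."""
    gained = descendant - ancestor  # families present only in the descendant
    lost = ancestor - descendant    # families present only in the ancestor
    return len(descendant), len(gained), len(lost)


# Hypothetical gene-family sets for an ancestral node and a descendant.
ancestor = {"fam1", "fam2", "fam3", "fam4"}
descendant = {"fam1", "fam2", "fam5", "fam6", "fam7"}

size, gains, losses = branch_events(ancestor, descendant)

# The three numbers are tied together by simple conservation:
# descendant size = ancestor size + gains - losses.
balanced = size == len(ancestor) + gains - losses
```

A branch on which gains far exceed losses is the pattern the slide describes as massive gene gain at the origin of each giant virus group.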
Wolf YI, Koonin EV. Genome reduction as the dominant mode of evolution. Bioessays. 2013 Sep;35(9):829-37
Do large viruses reverse the dominant trend of cellular
evolution:
• Expansive instead of reductive evolution?
Genomic “accordion”
Filée J. Route of NCLDV evolution: the genomic accordion.
Curr Opin Virol. 2013 Oct;3(5):595-9
Origin of the giant viruses: Whither the 4th domain?
• Giant viruses (mimiviruses, pandoraviruses and pithoviruses) belong to three distinct branches of Megavirales
• Among Megavirales, only mimiviruses and pandoraviruses have genomes that are “cell-like” in size and encompass some universal cellular genes (e.g. translation system components)
• 4th domain scenario implies multiple additional losses of numerous genes in other branches of Megavirales
• All Megavirales share viral hallmark genes: helicase-primase, jelly roll capsid, packaging ATPase
• Megavirales emerged from the virus world
• Most of the universal cellular genes in giant viruses are of eukaryotic origin
• These universal genes of giant viruses are polyphyletic – old acquisitions in mimiviruses but recent acquisitions from amoeba in Pandoraviruses
• No evidence of origin of giant viruses from a distinct domain of cellular life
Virus Empire (viral/selfish domains): (+)RNA viruses; dsRNA viruses; (-)RNA viruses; Retroviruses and elements; ssDNA viruses and plasmids; dsDNA viruses and transposons; Viroids
Cellular Empire (cellular domains): Archaea; Bacteria; Eukaryota
Koonin, Dolja, COVIRO 2013
• Do giant viruses represent a 4th (5th etc) domain(s)
of life?
• Origin of the giant virus genes?
• Evolutionary relationships between giant viruses
and other Megavirales
• Origins of the Megavirales
Kapitonov VV, Jurka J. Self-synthesizing DNA transposons in eukaryotes.
Proc Natl Acad Sci U S A. 2006 Mar 21;103(12):4540-5
• Origin of the Megavirales…and other DNA viruses of eukaryotes
• Polintoviruses as the hotbed of eukaryotic DNA virus evolution
Polintons/Mavericks: the largest eukaryotic transposons
Most of the polintons are actually polintoviruses
Krupovic M, Bamford DH, Koonin EV.
Conservation of major and minor jelly-roll capsid proteins in Polinton (Maverick) transposons suggests that they are
bona fide viruses. Biol Direct. 2014 Apr 29;9(1):6
Polintovirus-centered network of evolutionary connections
Large DNA viruses/transposons span the tree of eukaryotes
History of double-beta-barrel capsid proteins
Phylogeny of protein-primed polymerases
Krupovic, Koonin, submitted
Natural history of the DNA viruses of eukaryotes
Natalia Yutin, NCBI
Kira Makarova, NCBI
Patrick Forterre, Inst Pasteur
David Prangishvili, Inst Pasteur
Dennis Bamford, Univ. Helsinki
Yuri Wolf, NCBI
Mart Krupovic, Inst. Pasteur
Didier Raoult and lab, Univ. Aix-Marseille
Evolutionary Genomics Research
Group at the NCBI
http://www.ncbi.nlm.nih.gov/research/groups/koonin/