David Mead1, Vinay Dhodda1, Robert DiFrancesco1, Melodee Patterson1, Ronald Godiska1, Mya Breitbart2, Forest Rohwer2, Mark Young3, Paul Richardson4, Thomas Schoenfeld1
1 Lucigen Corporation, Middleton, WI; 2 San Diego State University, San Diego, CA; 3 Montana State University, Bozeman, MT; 4 Joint Genome Institute, Walnut Creek, CA
Bacteriophage DNA Polymerases from Thermal Aquifers
Waterborne viruses (phages) in thermal springs have been largely unexamined, despite their potential importance in the biosphere. Phages promote microbial diversity by predation of the most abundant microbes and by transfer of genes through transduction and lysogeny. Sequence analysis (>37,000 reads) of thermal aquifer phage libraries provides insight into viral complexity, lifestyles, molecular diversity, and access to coding sequences for useful proteins. As expected, a large fraction of identifiable sequences encode phage specific proteins such as nucleic acid metabolizing enzymes, phage structural proteins, lytic enzymes, and mobile elements such as plasmids and transposons. Numerous similarities to enzymes associated with temperate phage, such as integrases, suggest that lysogeny is common in thermal environments. Unexpectedly, a Thermosynechococcus-like photosynthesis regulatory gene was found next to an integrase gene. Over 200 DNA polymerase genes have been identified with approximately 58 being full length. We have expressed 10 active enzymes, one of which allows isothermal amplification at elevated temperatures with greater specificity than conventional polymerases. This subset of thermostable phage DNA polymerases appears much more diverse than known microbial or phage enzymes. The fact that no pol genes were re-isolated suggests the level of diversity seen so far is the tip of a very large iceberg.
Phage in the Environment
Viruses are now recognized as the most abundant forms of life on earth. An estimated 10^31 viruses exist in the oceans alone, probably outnumbering their hosts tenfold (1). Aquatic viruses range in concentration from 10^4 to 10^7/ml in the water column (2), and exceed 10^8/cm^3 in the sediment (3). Viruses play an important role in global ecology, modulating the abundance and diversity of microbial populations through their lytic and lysogenic activity (4, 5), and play a critical role in the global nutrient and energy cycle (6). Phages are also believed to be a major driving force in cellular evolution (7), serving as a primary vector for horizontal genetic exchange and promoting diversity by their predation and lysogeny of the most abundant microbes. They also alter the phenotypes of bacteria they lysogenize.
Phage at High Temperatures
While the oceans contain ~1.3 × 10^21 liters, ground water aquifers may contain 10^19 liters of pore space (8). Marine and fresh water ecosystems are characterized by moderate to cold temperatures, whereas high temperature aquifers predominate at pressure and temperature gradients deep in the earth and in geothermal sites such as Yellowstone National Park and hydrothermal vents on the ocean floor. Temperatures as high as 230°C and pressures of 300 psi have been measured at Yellowstone National Park within 200 feet of the surface (9).
Life in the deep/hot biosphere is purely prokaryotic, improving the chances of more completely describing selected biomes. Furthermore, the geochemical energy in hydrothermal environments allows unique chemolithoautotrophic metabolic pathways. Most work on hot spring extremophiles has concentrated on sedentary surface microbes that grow at less than 100°C; this study looks at planktonic viruses and microbes in the emergent water column, which presumably originates at much higher temperatures. Viruses are the only known predators in thermal aquifers and can have a significant effect on hot spring microbial food webs. Breitbart (10) estimates that thermal aquifers may contain 3.7 × 10^29 prokaryotic cells and 3.7 × 10^30 viruses, which is roughly equivalent to the total estimated viruses in the ocean (8). Viruses may be responsible for as much as 3.6 Gtons of carbon turnover per year, which is comparable to the impact of phages on the ocean. Viruses, particularly those of the thermal environments, may be important contributors to global molecular diversity (11). We have used a metagenomics approach to study the viruses in terrestrial thermal pools.
Sampling of Four Thermal Sites in California and Yellowstone
Numerous sites in Long Valley, California and Yellowstone National Park have been sampled. All of the work described focuses on four sites: one in Long Valley and three in Yellowstone. Viral particles were isolated and concentrated from several hundred liters of hot spring water by tangential flow filtration.
ABSTRACT
Viral Library Construction
The construction of representative phage community DNA libraries is complicated by low yields of viral DNA, and the need to remove contaminating cellular and free DNA. A series of differential filtration and centrifugation steps was used to isolate and concentrate the phage until epifluorescence microscopy indicated an absence of contaminating microbial cells. Nuclease treatment was used to remove free DNA. We do not use density centrifugation in cesium chloride due to the unacceptable loss of already limited amounts of viral material, the incomplete separation of microorganisms and the potential bias introduced with this technique.
Improved Vectors for Library Construction
Representative libraries of phage DNA are particularly problematic to construct, due in part to toxic coding sequences in their genomes. The CloneSmart® vectors were developed to improve the number of recombinants and reduce cloning bias in libraries. These vectors eliminate transcription and translation of recombinant inserts and have proven effective in cloning numerous otherwise toxic phage genes (personal communication Ry Young, Texas A&M).
Viral Morphotypes
Comparison of viral metagenomic libraries to the GenBank non-redundant database.
Panel A) BLASTx results were examined to detect homology to proteins derived from phage, virus or other organisms (mostly bacteria and archaea). BLASTx results were categorized by source of the strongest hit.
Panel B) Phage-like genes were categorized in functional groups using keywords shown below.
Little Hot Creek (top, left), Bath (top, right), BearPaw (bottom, left) & Octopus (bottom, right) Hot Spring. Bath Hot Spring shows significant geyser activity due to superheated water emanating from the underground aquifer.
Imaging of Hot Spring Phage. Panel A. Phage particles were captured on an AnoDisk filter (Millipore) and stained with SYBR Gold (Molecular Probes). The particles were imaged using a Bio-Rad 1024 laser scanning confocal microscope (U. Wisc. Keck Imaging Center). The numerous small spots are phage particles. An individual cell can be seen in upper center. Panels B to K are transmission electron micrographs of phages. Panel B shows a phage cultured from YNP. Also shown are phages directly isolated from Moundview (C), Azure (D), Bath (E and I), Octopus (F), Cavern (G), Paint Pots (J), and Azure (K) Hot Springs, all of YNP. (Electron micrographs are courtesy of Sue Brumfield, Montana State University.)
Panel B. Functional categories of phage-like genes, shown as one pie chart each for Little Hot Creek, BearPaw and Octopus. Percentages per chart, in the order transcribed:
Category | Chart 1 | Chart 2 | Chart 3
DNA replication/repair | 19% | 17% | 19%
recombination | 10% | 10% | 10%
transcription/translation | 17% | 18% | 19%
metabolism/modification | 32% | 29% | 25%
lytic enzyme | 11% | 17% | 8%
structural protein | 9% | 2% | 1%
mobile element | 2% | 7% | 18%
Shown are sequence reads in which the strongest similarity was to proteins that match functional group keywords shown in Table 2.
Functional Groupings of Keywords into Categories
Functional Group | Key Words
DNA replication/repair | DNA repair, DNA polymerase, replication, primase, helicase, reverse transcriptase, repA, DNA mismatch repair
Recombination | integrase, resolvase, intron, terminase, DNA/RNA ligase, recN
Transcription/Translation | RNA polymerase, transcription, tRNA synthetase
Mobile elements | vector, transposase, pathogen, virulence/virulent, toxin
Nucleic acid metabolism/modification | nuclease, nucleotidyl transferase, DNA glycosylase, DNA methylase, reductase, nucleotidase, restriction, RNase, DNase
Lytic enzyme | sidase, protease, peptidase, proteinase, trypsin, chitin, lysozyme, lysin, lytic
Structural protein | tail, tape measure, head
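The keyword grouping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the poster does not say how keywords were matched or how ties between groups were resolved, so the case-insensitive substring test and the "first group in table order wins" rule below are assumptions, as is the function name `classify_hit`. The combined entries "DNA/RNA ligase" and "virulence/virulent" are expanded into separate keywords for matching.

```python
# Keyword groups taken from the table above, in table order.
# Assumptions (not stated on the poster): matching is a case-insensitive
# substring test, and the first matching group in table order wins.
FUNCTIONAL_GROUPS = {
    "DNA replication/repair": [
        "DNA repair", "DNA polymerase", "replication", "primase", "helicase",
        "reverse transcriptase", "repA", "DNA mismatch repair",
    ],
    "Recombination": [
        "integrase", "resolvase", "intron", "terminase",
        "DNA ligase", "RNA ligase", "recN",
    ],
    "Transcription/Translation": [
        "RNA polymerase", "transcription", "tRNA synthetase",
    ],
    "Mobile elements": [
        "vector", "transposase", "pathogen", "virulence", "virulent", "toxin",
    ],
    "Nucleic acid metabolism/modification": [
        "nuclease", "nucleotidyl transferase", "DNA glycosylase",
        "DNA methylase", "reductase", "nucleotidase", "restriction",
        "RNase", "DNase",
    ],
    "Lytic enzyme": [
        "sidase", "protease", "peptidase", "proteinase", "trypsin",
        "chitin", "lysozyme", "lysin", "lytic",
    ],
    "Structural protein": ["tail", "tape measure", "head"],
}


def classify_hit(description):
    """Return the first functional group with a keyword in the description.

    Returns None when no keyword matches (the "No Match to Keywords" case).
    """
    text = description.lower()
    for group, keywords in FUNCTIONAL_GROUPS.items():
        if any(keyword.lower() in text for keyword in keywords):
            return group
    return None
```

Short keywords such as "tail", "head" or "repA" match many unrelated descriptions under a substring rule, so a real pipeline would probably need word-boundary matching or manual review.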
NanoClone Library Construction Methods
The low amounts of starting material necessitate use of a unique genome amplification technology. NanoClone Library Construction (see below) uses DNA sheared to the appropriate molecular weight and end repaired. Linkers are ligated to the ends of the fragments. These linkers serve as priming sites for amplification (see below). The amplification products are cloned into pSMART vectors.
Gel lanes: M, 50 ng, 5 ng, 500 pg, 50 pg, 5 pg, 0, M
Anonymous genome amplification technology applied to phage lambda using known amounts of starting material. Amplified DNA was excised from the agarose gel and cloned into pSMART HCKan vector for sequencing. The primary band in the 5 pg lane produced authentic lambda DNA sequences.
[Figure: traces of the ROX-labeled primer/template for Blank, Taq, and PyroPhage 4110, 653, 3063, 3173, 488 and 2323. Peak labels: Unextended, Extended, A tailed, 3’ exo. X axis: 25 to 50 nt.]
Thermostable phage DNAP activity assay. ROX-labeled primer/template mix (top) is extended. Unextended product runs at 37 nt. Extension products run at 41 nt. A-tailing products run at 42 nt. 3’ exo products run at less than 37 nt. Minor peaks at 25, 30, 35, 40, 45, and 50 nt are TAMRA-labeled standards. PyroPhage DNAPs and a Taq control are indicated.
The pSMART vectors are “transcription-free”. They eliminate vector-driven transcription of insert DNA and terminate transcription that initiates from promoters within the insert. These vectors allow cloning of many types of difficult DNAs, providing unbiased library construction and recovery of toxic genes.
The pSMART-HC vectors are high-copy (300-500 copies/cell, similar to pUC19). The pSMART-LC vectors contain the ROP gene to decrease the copy number (15-20 copies/cell), providing higher insert stability. In typical blunt cloning experiments, the background of non-recombinant vector is < 0.1%.
[Figure: maps of pSMART-HCKan/-LCKan and pSMART-HCAmp/-LCAmp. Labeled features: three terminators, blunt cloning site, ROP, Ori, and Kan or Amp.]
Phage-like genes
A high proportion of sequence similarities are to genes normally found in viruses and phage, confirming the viral origin of the DNA used for library construction.
Frequency of phage-like genes
Molecular Diversity of PyroPhage pol genes
PyroPhage pol genes were aligned to known eukaryotic, prokaryotic and viral polymerases using CLUSTAL W. The PyroPhage DNA polymerases are distinct from available thermostable DNA polymerases.
Viral Metagenomic Libraries from Hot Springs
Library | Hot Spring | Temp (°C) | Phage/ml in Spring (a) | Vol. Sampled (liters) | Theoretical Yield (ng) (b) | Actual Yield (ng) | % Yield of DNA (c) | Vector | Average Insert Size
L1.1 | Little Hot Creek, CA | 84 | not tested | 450 | --- | 80 | --- | pcrSM | 3-6 kb
Y4.9 | Octopus, YNP | 80 | 2.5 × 10^4 | 1058 | 571 | 10 | 1.7% | pcrSM | 2-3 kb
Y4.16 | Bearpaw, YNP | 74 | 5.9 × 10^4 | 450 | 573 | 59 | 10% | pcrSM | 2-3 kb
Y2.1 | Bath, YNP | 93 | 3.7 × 10^4 | 360 | 288 | 90 | 31% | pUC | 3-6 kb

Library | Sequence Reads | No Homology | No Match to Keywords | Virus or Phage Homology | DNA pol Gene Homology | Complete pol Genes | Expressed pol Genes | Purified
L1.1 | 7479 | 2363 | 3775 | 262 | 14 | 2 | 1 | 1
Y4.9 | 21,797 | 12,705 | 6548 | 2036 | 148 | 34 | 7 | 2
Y4.16 | 7545 | 2510 | 3619 | 333 | 57 | 20 | 2 | 1
Y2.1 | 765 | 200 | 410 | 19 | 3 | 2 | |
totals | 37,586 | 17,778 | 14,352 | 2,650 | 222 | 58 | 10 | 4
Summary of thermostable phage libraries screened in this project. a) Phage concentrations are based on direct counts by epifluorescence, b) Theoretical yields are total amount of phage DNA in samples assuming an average of 54 attograms of DNA/phage particle and use of only a portion of the sample (40%) for each DNA preparation (phage/ml X vol. sampled (l) X 1000 ml/l X 54 attogram DNA/phage X 40%), c) Percent yield is actual yield divided by theoretical yield.
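The theoretical-yield formula in note (b) can be checked with a short calculation. The sketch below simply transcribes that formula; the constant and function names are illustrative, and the only inputs are the phage counts, volumes and actual yields from the table.

```python
# Constants stated in table note (b).
ATTOGRAMS_PER_PHAGE = 54      # assumed average DNA content per phage particle
FRACTION_OF_SAMPLE = 0.40     # portion of the sample used per DNA preparation
ATTOGRAMS_PER_NANOGRAM = 1e9  # 1 ng = 10^9 ag
ML_PER_LITER = 1000


def theoretical_yield_ng(phage_per_ml, liters_sampled):
    """Theoretical phage DNA yield in nanograms, following note (b)."""
    particles = phage_per_ml * liters_sampled * ML_PER_LITER
    attograms = particles * ATTOGRAMS_PER_PHAGE * FRACTION_OF_SAMPLE
    return attograms / ATTOGRAMS_PER_NANOGRAM


def percent_yield(actual_ng, theoretical_ng):
    """Percent yield as defined in note (c): actual divided by theoretical."""
    return 100.0 * actual_ng / theoretical_ng


# (phage/ml, liters sampled, actual yield in ng) from the table.
LIBRARIES = {
    "Y4.9 Octopus": (2.5e4, 1058, 10),
    "Y4.16 Bearpaw": (5.9e4, 450, 59),
    "Y2.1 Bath": (3.7e4, 360, 90),
}

for name, (phage_per_ml, liters, actual_ng) in LIBRARIES.items():
    expected = theoretical_yield_ng(phage_per_ml, liters)
    print(f"{name}: {expected:.0f} ng theoretical, "
          f"{percent_yield(actual_ng, expected):.1f}% yield")
```

Rounded to the nearest nanogram, the three theoretical yields come out to 571, 573 and 288 ng, matching the table; the percent yields agree with the tabulated values to within rounding.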
Sequencing of the Libraries, Assembly of Contigs and Functional Genomics
More than 37,000 reads (~37 Mb) of viral DNA were determined by the DOE Joint Genome Institute. Individual reads were trimmed and vector sequences were removed using SeqManII (DNASTAR). Trimmed sequences were assembled using SeqManII (DNASTAR) at a minimum match of 95% and match size of 20 nucleotides.
Assembly of the reads
• Little Hot Creek 3.6 Mbp in 3,014 contigs
• Bearpaw 6.1 Mbp in 6,191 contigs
• Octopus 14.4 Mbp in 13,543 contigs
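As a rough illustration of what these assembly figures imply, the mean contig length per library can be computed directly from the totals above. The names below are illustrative, and because the totals are rounded to 0.1 Mbp the results are approximate.

```python
# (total assembled bp, number of contigs) from the assembly summary above.
ASSEMBLIES = {
    "Little Hot Creek": (3_600_000, 3014),
    "Bearpaw": (6_100_000, 6191),
    "Octopus": (14_400_000, 13543),
}


def mean_contig_length(total_bp, n_contigs):
    """Average contig length in base pairs."""
    return total_bp / n_contigs


mean_lengths = {
    site: mean_contig_length(total_bp, n_contigs)
    for site, (total_bp, n_contigs) in ASSEMBLIES.items()
}

for site, length in mean_lengths.items():
    print(f"{site}: about {length:.0f} bp per contig")
```

All three libraries average roughly 1 kb per contig, which is about the size of a single read given ~37 Mb in ~37,000 reads, so most contigs in these assemblies are short.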
In Silico Analysis of the Libraries
Translations of the trimmed reads were aligned to known sequences in the GenBank nr protein database using BLASTx. Sequences were considered similar if the Expect value was equal to or less than 0.001.
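A minimal sketch of the similarity cutoff, assuming hits are already available as (read ID, subject description, E value) tuples. The tuple layout and the function name are hypothetical, and treating the lowest E value as the "strongest" hit is an assumption; the poster does not say whether E value or bit score was used for ranking.

```python
E_VALUE_CUTOFF = 1e-3  # "equal to or less than 0.001", as stated above


def best_significant_hits(hits):
    """Keep, for each read, the passing hit with the lowest E value.

    hits: iterable of (read_id, subject_description, e_value) tuples.
    Returns {read_id: (subject_description, e_value)}; reads with no hit
    at or below the cutoff are absent from the result.
    """
    best = {}
    for read_id, subject, e_value in hits:
        if e_value > E_VALUE_CUTOFF:
            continue  # not considered similar
        if read_id not in best or e_value < best[read_id][1]:
            best[read_id] = (subject, e_value)
    return best


# Made-up example hits, for illustration only.
example_hits = [
    ("read_1", "phage integrase", 4e-12),
    ("read_1", "hypothetical protein", 1e-5),
    ("read_2", "DNA polymerase I", 0.5),
    ("read_3", "helicase", 0.001),
]
print(best_significant_hits(example_hits))
```

Reads that drop out here correspond to the "No Homology" column of the library summary; the surviving best hits are what a keyword classification would then be applied to.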
CloneSmart® Vectors: No Background, No Transcription
[Panel A: bar chart of the percentage of reads (0% to 100%) classed as nonviral, viral, or no similarity for Little Hot Creek, Bear Paw, Octopus, and overall.]
Keyword | LHC | Bearpaw | Octopus
head | 3 | 10 | 15
helicase | 132 | 122 | 274
integrase | 67 | 30 | 22
ligase | 0 | 11 | 18
lysin | 118 | 260 | 389
methylase | 50 | 32 | 127
methyltransferase | 55 | 100 | 262
nuclease | 94 | 145 | 232
polymerase | 147 | 235 | 548
primase | 14 | 19 | 21
reverse transcriptase | 42 | 35 | 11
restriction | 71 | 34 | 59
resolvase | 24 | 15 | 87
ribonuclease | 20 | 13 | 62
tail | 17 | 38 | 289
topoisomerase | 18 | 19 | 24
transposase | 208 | 86 | 51
Phage Lifestyles
Inferences about the lifestyles of phage in thermal aquifers can be based on apparent similarity to genes of known function. For example, the identification of 119 apparent integrase genes suggests that lysogeny is a common lifestyle in thermal aquifers. Likewise, apparent DNA polymerase genes suggest lytic phages, and reverse transcriptases suggest RNA phages.
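The integrase figure quoted here is the sum of the per-library keyword counts (LHC, Bearpaw, Octopus) reported on this poster. A small tally for the three lifestyle markers named in the paragraph, with illustrative variable names:

```python
# Keyword counts per library as reported on the poster:
# (Little Hot Creek, Bearpaw, Octopus).
LIFESTYLE_KEYWORD_COUNTS = {
    "integrase": (67, 30, 22),             # marker used for lysogeny
    "polymerase": (147, 235, 548),         # marker used for lytic phages
    "reverse transcriptase": (42, 35, 11), # marker used for RNA phages
}

totals = {
    keyword: sum(counts)
    for keyword, counts in LIFESTYLE_KEYWORD_COUNTS.items()
}

for keyword, total in totals.items():
    print(f"{keyword}: {total} keyword hits across the three libraries")
```

The integrase total reproduces the 119 cited in the text. Note that the "polymerase" keyword count includes any hit whose description contains that word, so it is not the same quantity as the 222 apparent DNA pol genes reported elsewhere on the poster.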
Thermostable DNA Polymerases from Viral Metagenomic Libraries
For several reasons, thermostable viral DNA polymerases are expected to be very attractive alternatives to reagent enzymes derived from microbes.
PyroPhage™ DNAPs
PyroPhage DNAPs are expressed from the genes of viruses (phage) that inhabit boiling hot springs.
PyroPhage DNAPs as Alternatives for DNA Amplification
• Viral DNAPs are replication enzymes, in contrast to currently available DNA repair enzymes, and represent the first viable alternative to existing reagent enzymes.
Based on known viral enzymes, PyroPhage DNAPs are expected to have substantial advantages.
• Unprecedented Molecular Diversity: This enzyme family is the most diverse known, which suggests a range of activities that can be tailored to the needs of the application.
• Improvements can be extrapolated from the few well characterized viral DNAPs:
  - Excellent strand displacement and processivity will allow detection and analysis of “difficult” sequences.
  - Lower error rate results in more reliable data.
  - Absence of end-product inhibition improves quantitative PCR and increases yield.
  - Strand displacement allows isothermal amplification, which will increase throughput and reduce instrumentation costs.
  - Reduced stuttering will allow use of optimal markers for genetic tests.
  - Improved incorporation of nucleotide analogs will broaden the range of detection methods.
  - Accessory functions may expand functionality.
Detection of PyroPhage Coding Sequences by Sequence Similarity
The BLASTx search revealed 222 apparent pol genes. Of these, 58 appear to be complete genes. All of the latter were tested for expression using a DNA polymerase assay developed at Lucigen.
Family A PyroPhage Polymerase Percent Identities
 | PyroPhage 3063 | PyroPhage 488 | PyroPhage 3173 | PyroPhage 967 | Aae | Taq | Tth | Tma | T5_phage | Bst | Eco_polI | T7_phage
PyroPhage 3063 | ---
PyroPhage 488 | 31 | ---
PyroPhage 3173 | 28 | 45 | ---
PyroPhage 967 | 31 | 51 | 82 | ---
Aae | 63 | 29 | 23 | 34 | ---
Taq | 24 | 25 | 22 | 25 | 22 | ---
Tth | 26 | 24 | 18 | 26 | 25 | 85 | ---
Tma | 30 | 22 | 22 | 28 | 28 | 42 | 42 | ---
T5_phage | 15 | 19 | 17 | 19 | 23 | 18 | 18 | 21 | ---
Bst | 24 | 23 | 20 | 28 | 24 | 40 | 41 | 40 | 14 | ---
Eco_polI | 30 | 22 | 19 | 26 | 28 | 38 | 38 | 43 | 21 | 40 | ---
T7_phage | 16 | 13 | 15 | 17 | 18 | 17 | 17 | 16 | 12 | 15 | 13 | ---
Family B PyroPhage Polymerase Percent Identities
 | PyroPhage 4110 | PyroPhage 2323 | PyroPhage 2783 | Tli | Pfu | Pae | RM378_phage | HHV | CMV | vaccinia_virus | baculovirus | Chlorellavirus | T4_phage | Eco_polIIIa | Human_polalpha | Phi29Phage
PyroPhage 4110 | ---
PyroPhage 2323 | 83 | ---
PyroPhage 2783 | 91 | 86 | ---
Tli | 23 | 21 | 22 | ---
Pfu | 23 | 21 | 22 | 74 | ---
Pae | 27 | 26 | 27 | 35 | 35 | ---
RM378_phage | 14 | 14 | 15 | 13 | 15 | 13 | ---
HHV | 18 | 17 | 17 | 19 | 16 | 21 | 13 | ---
CMV | 19 | 15 | 17 | 18 | 17 | 21 | 15 | 28 | ---
vaccinia_virus | 19 | 19 | 18 | 16 | 15 | 17 | 14 | 13 | 16 | ---
baculovirus | 12 | 11 | 14 | 16 | 15 | 10 | 10 | 14 | 14 | 13 | ---
Chlorellavirus | 20 | 20 | 20 | 23 | 20 | 20 | 14 | 19 | 24 | 17 | 17 | ---
T4_phage | 15 | 16 | 17 | 16 | 17 | 12 | 13 | 12 | 12 | 10 | 9 | 11 | ---
Eco_polIIIa | 12 | 12 | 12 | 12 | 14 | 12 | 9 | 12 | 8 | 10 | 7 | 8 | 9 | ---
Human_polalpha | 17 | 18 | 15 | 21 | 23 | 21 | 18 | 15 | 14 | 13 | 14 | 18 | 13 | 10 | ---
Phi29Phage | 9 | 9 | 11 | 10 | 8 | 5 | 10 | 9 | 12 | 10 | 13 | 11 | 7 | 4 | 9 | ---
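Percent identity between two aligned sequences can be computed as in the sketch below. The poster does not state which denominator was used for the matrices, so counting only columns where neither sequence has a gap is an assumption, and the function name is illustrative. The example inputs are the short Family A active-site motifs for PyroPhage 3063, Aae and Taq given in the motif alignment on this poster; they are not the full-length sequences behind the matrices, so the numbers differ from the tabulated identities.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Family A active-site motifs from the poster's motif alignment.
MOTIF_3063 = "RQLAKAVNFGLIYG"
MOTIF_AAE = "RQLAKAINFGLIYG"
MOTIF_TAQ = "RRAAKTINFGVLYG"

print(f"3063 vs Aae: {percent_identity(MOTIF_3063, MOTIF_AAE):.1f}%")
print(f"3063 vs Taq: {percent_identity(MOTIF_3063, MOTIF_TAQ):.1f}%")
```

Even on this 14-residue motif the ordering matches the full-length matrix: 3063 is much closer to the Aquifex enzyme than to Taq.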
Isothermal Amplification using PyroPhage DNA polymerases
Five units of 3173 polymerase were used to amplify one nanogram each of ssM13mp18 and pUC19 plasmid DNA. Random decamer primers were added to 0.5 μM or 5 μM, as indicated. Reactions were incubated at 95°C prior to addition of enzyme, then for 16 hours at 55°C with enzyme. One fiftieth of each reaction was resolved on a 1% agarose gel. Results are shown in Panel A. To verify that the amplification was specific for the template DNA, one μl of the amplification product of the positive pUC19 reaction was tested in a PCR reaction using primers specific for the ampicillin resistance gene of the original plasmid template. As a negative control, a reaction containing all the components of the positive reaction, including input DNA, but without nucleotides to prevent amplification, was tested by PCR. As a positive control, the same sequence was amplified directly from 1 ng of pUC19.
Additional amplifications using 2.5 units of 488 DNA polymerase and 15 units of 967 DNA polymerase are also shown. In both cases, the reaction conditions were essentially the same as described except that the template was double-stranded, linear lambda phage DNA, the reaction temperature was 50°C and the concentration of magnesium was varied from 1.5 to 20 mM.
Family A 5’ Exo Domain
1091   LSTSSGFPTGA
Tma    LSTSTGIPTNA
Tth    LTTSRGEPVQA
Taq    LTTSRGEPVQA   substitution: (D)

Family A Active Site. Discrimination against modified nucleotides
3063   RQLAKAVNFGLIYG
2710   RQIGKSANFGLIYG
488    RQIAKSANFGLIYG
1795   RQIAKSANFGLIYG
T7 Ph  RDNAKTFIYGFLYG
Aae    RQLAKAINFGLIYG
Tma    RRAGKMVNFSIIYG
Tth    RRAAKTINFGVLYG
Taq    RRAAKTINFGVLYG   substitutions: (D) (Y)

Family B 3’ Exo Domain
3063   FLYIDTETVGD
4110   VAAFDIEVDAT
2323   VAAFDIEVDAT
Pfu    ILAFDIETLYH
Vent   LLAFDIETFYH   substitutions: (A A)

Family B Active Site. Discrimination against modified nucleotides
4110   QFAFKLILVSAYG
3001   QFAFKLILVSAYG
2323   QFAFKLILVSAYG
Pfu    QKAIKLLANSFYG
Vent   QRAIKLLANSYYG   substitution: (L)
Alignment of PyroPhage DNA polymerase domains with commercially relevant motifs. Selected PyroPhage DNAP sequences are aligned to mutations of domains of Taq and Vent that improve these enzymes as reagents. Also shown are Tth, Tma, and phage T7 domains. Conserved amino acids (compared to Taq and Vent) are shown in red. Amino acids that were substituted to create G46D (Taq 5’ exo domain), R660D/F667Y (Taq active site), D141A/E143A (Vent 3’ exo domain) and A488L (Vent active site) are shown in blue. Variations different from both wild type and modified are shown in green.
Other Interesting Genes Found in Contigs from the Libraries. A cultivated Pyrobaculum spherical virus has 83% identity to a 3 kb contig from Bear Paw hot springs metagenomic DNA.
BLASTx Analysis of selected contigs
(BearPaw) CONTIG 1494 ≈ 3 kb
Matches Pyrobaculum spherical virus
• A non-lytic dsDNA virus of Pyrobaculum sp. D11 (PSV)
• Isolated from Obsidian Pool YNP
• Genome is 28 kb
• ORFs have no gene matches to public databases
Conservation of gene order and identity between the BearPaw CONTIG 1494 and Pyrobaculum spherical virus
ORF | % identity
ORF88A | 89
ORF137 | 88
ORF235 | 87
ORF239 | 87
ORF211 | 79
ORF107 | 77
[Figure: both maps are drawn against a scale marked 500, 1000, 1500, 2000 and 2500.]
[Figure labels: BtpA Photosystem I biogenesis protein; Phage integrase]
References
1. Suttle CA (2005) Viruses in the sea. Nature 437(7057):356-61.
2. Wommack KE, Colwell RR (2000) Virioplankton: viruses in aquatic ecosystems. Microbiol Mol Biol Rev. 64(1):69-114.
3. Breitbart M, Felts B, Kelley S, Mahaffy JM, Nulton J, Salamon P, Rohwer F (2004) Diversity and population structure of a near-shore marine-sediment viral community. Proc R Soc Lond B Biol Sci. 271(1539):565-74.
4. Chibani-Chennoufi S, Bruttin A, Dillmann ML, Brussow H (2004) Phage–host interaction: An ecological perspective. J Bacteriol. 186:3677-3686.
5. Paul JH (1999) Microbial gene transfer: an ecological perspective. J Mol Microbiol Biotechnol. 1:45-50.
6. Fuhrman JA (1999) Marine viruses and their biogeochemical and ecological effects. Nature 399:541-548.
7. Weinbauer MG, Rassoulzadegan F (2004) Are viruses driving microbial diversification and diversity? Environ Microbiol. 6(1):1-11.
8. Gold T (1992) The deep, hot biosphere. Proc Natl Acad Sci U S A 89:6045-9.
9. White DE, Fournier RO, Muffler LJP, Truesdale AH (1975) Physical results of research drilling in thermal areas of Yellowstone National Park, Wyoming. Geological Survey Professional Paper No. 892:1-89.
10. Breitbart M, Wegley L, Leeds S, Schoenfeld T, Rohwer F (2004) Phage community dynamics in hot springs. Appl Environ Microbiol. 70(3):1633-40.
11. Villarreal LP, DeFilippis VR (2000) A hypothesis for DNA viruses as the origin of eukaryotic replication proteins. J Virol. 74(15):7079-84.
Support
This material is based upon work supported by the National Science Foundation under Awards Number 0109756 and 0215988, National Human Genome Research Institute under Award Number 1 R43 HG02714-01 to TS and Department of Energy under Award Numbers 70588S02-I and DE-FG02-02ER83484 to DAM. Any opinions, findings, and conclusions or recommendations expressed in this poster are those of the authors and do not necessarily reflect the views of NSF, NHGRI or DOE. Sequencing was performed at DOE Joint Genome Institute as part of their Genomes-to-Life program.
[Figure, Panel A: Isothermal amplification of circular templates using PyroPhage 3173 POL. Lanes: ssM13 template with 5 and 0.5 μM primer, pUC template with 5 and 0.5 μM primer, no template with 5 μM primer. Size markers: 10, 6, 3 and 1 kb.]
[Figure, Panel B: Verification by PCR of amplification specificity. Lanes: pUC control, pos amp, neg amp. Size markers: 10, 6, 3 and 1 kb.]
[Figure, Panel C: Isothermal amplification of lambda DNA using PyroPhage 488 and 967 POLs. For each enzyme, lanes without (-) and with (+) template at MgCl2 concentrations from 1.5 to 20 mM. Size markers: 10, 6, 3 and 1 kb.]
2120 W. Greenview Drive, Middleton, WI 53562, www.lucigen.com, 1-888-575-9695
Identification of a Photosystem I gene next to an integrase. Little Hot Creek contig 27 (95% assembly) was compared to the GenBank nr database. The 2740 bp contig showed homologies to two similar genes, a Photosystem I biogenesis protein from Thermosynechococcus elongatus and a phage integrase from Rhodoferax ferrireducens.
Ten clones express DNAP
Clone | Source | Strongest similarity | E value | % identity | % conserved | Exo
3063 | BearPaw | Aquifex pyrophilus pol I | 0.0 | 63 | 79 | 3’
488 | Little Hot Creek | Aquifex pyrophilus pol I | 1e-46 | 33 | 51 | No
3173 | Octopus | Desulfitobacterium hafniense pol I | 2e-37 | 30 | 48 | 3’
4110 | Octopus | Pyrodictium occultum pol II | 3e-55 | 28 | 46 | No
2323 | Octopus | Pyrobaculum aerophilum pol II | 1e-47 | 28 | 45 | 3’
653 | Bearpaw | Pyrococcus furiosus virus pol | 2e-12 | 37 | 59 | 3’
967 | Octopus | Aquifex aeolicus pol I | 3e-44 | 36 | 53 | No
2783 | Octopus | Sulfolobus tokodaii pol II | 3e-56 | 27 | 46 | 3’
2072 | Octopus | Sulfolobus tokodaii pol II | 2e-10 | 39 | 60 |
2123 | Octopus | Pyrococcus abyssi pol II | 1e-04 | 35 | 51 |
photosystem I biogenesis protein [Thermosynechococcus elongatus BP-1] Length=296
Score = 52.0 bits (123), Expect = 9e-06
Identities = 43/136 (31%), Positives = 66/136 (48%), Gaps = 11/136 (8%)
Frame = +1

Query  133  MGVDSSGLTGEGEALMRLAAE-HPSIEFFASVAFKYMP--EEPDPVTAADNARMAGFVPT  303
            M D + G L+R E I+ FA V K+ P+ TA + G
Sbjct  125  MATDQGLIEGPAHQLLRYRRELGQDIKIFADVMVKHAQPLHSPNLATAVRDTFDRGLADG  184

Query  304  T--SGSATGAPPDLE--KIRAMAARG-PLAVASGMTPDNVHLYSPYLSDILVATGIAAD-  465
            SG ATG PP E + A AA+G PL + SG + DNV PY++ ++VA+ + +
Sbjct  185  VILSGWATGQPPTEEDLSVAARAAKGQPLFIGSGASWDNVAQLVPYVNGVIVASSLKRNG  244

Query  466  --EHHLDPGKLARFIK  507
            E +DP +++RF++
Sbjct  245  QIEQPIDPIRVSRFVE  260

Phage integrase [Rhodoferax ferrireducens DSM 15236] Length=347
Score = 74.7 bits (182), Expect = 4e-12
Identities = 80/287 (27%), Positives = 117/287 (40%), Gaps = 31/287 (10%)
Frame = +1

Query  106  RDSETYRHIWSTWCKYLQGGQAGGRSRPIPWYEVDAATVVGFLQ--SGPASRKEKLESSS  279
            R + YR W W +L PW + V +L S A+ ++ +SS
Sbjct  45   RSVKQYRSTWFNWVAWLPPHT--------PWEKAAPEQVSAYLHQLSASATARQTQPNSS  96

Query  280  NE----TTKRRYWRVLDRIYNYAKAHNWVDSNPLVGLTTNDKPKSEDTLGTILDPHVWHA  447
            T+RRYWR+L IY +A W ++NP T + P SE IL
Sbjct  97   RRPASTVTQRRYWRMLRDIYAHAVVMAWCEANPCAQAT--EIPASEAMASMILPAWALRQ  154

Query  448  AEKLLAHPDRFDPI----SVRNRAILQILFGLGLAPQEVRALKTXXXXXXXXXXXXVNPS  615
            + + H + VRN A+L +L G E+ +L+
Sbjct  155  LQDGILHQASRQAVRKWQDVRNDALLLLLLHTGAKTGELVSLRVDQALKIRTEKHG-EQW  213

Query  616  KVHVDGHNTLRPRTLTLC-PKTSAAIREWLKARPAVAT--------SKSGQILFCTPKGP  768
            + +DG + R +TL P+ AA+ +WL+ R V +KS I P
Sbjct  214  AIQIDGEKDCQQRHITLDEPRAGAALAQWLRVRQHVPRKSPWLFFGAKSHVIDGKRELSP  273

Query  769  LGSVSLYLLVKSFLKKASDLAQREEP-PQAGPQVIRNSVLVRLLEDG  906
            L S ++++LV LK E AG + IRNSVL R LE G
Sbjct  274  LSSKTIFILVAGALKAHLPPNTFEGMLSHAGAEAIRNSVLARWLEAG  320