Supplementary Information
Evolution of Functional Six-Nucleotide DNA
Liqin Zhang,†,∥ Zunyi Yang,‡,∥ Kwame Sefah,† Kevin M. Bradley,‡ Shuichi Hoshika,‡ Myong-Jung Kim,‡ Hyo-Joong Kim,& Guizhi Zhu,†,# Elizabeth Jiménez,† Sena Cansiz,† I-Ting Teng,† Carole Champanhac,† Christopher McLendon,‡ Chen Liu,§ Wen Zhang,≠,¶ Dietlind L. Gerloff,‡ Zhen Huang,≠,¶,* Weihong Tan,†,#,* and Steven A. Benner‡,&,*
†Department of Chemistry, Department of Physiology and Functional Genomics, UF Health
Cancer Center, UF Genetics Institute, University of Florida, Gainesville, Florida 32611, United
States, ‡Foundation for Applied Molecular Evolution, Gainesville, Florida 32601, United States,
&Firebird Biomolecular Sciences LLC, Alachua, Florida 32615, United States,
#Molecular Science
and Biomedicine Laboratory, State Key Laboratory of Chemo/Biosensing and Chemometrics,
College of Chemistry and Chemical Engineering, College of Biology, Collaborative Innovation
Center for Chemistry and Molecular Medicine, Hunan University, Changsha 410082, China,
§Department of Pathology, Immunology, and Laboratory Medicine, University of Florida College
of Medicine, Gainesville, Florida 32610, United States, ≠Department of Chemistry, Georgia State
University, Atlanta, Georgia 30303, United States, ¶SeNA Research, Inc., Atlanta, Georgia 30303,
United States
Methods
Crystallization and X-ray crystal structure determination of DNA (1, 2). To supplement
crystal structures reported in a separate manuscript [Georgiadis et al. 2015], we used
selenium derivatization to obtain a crystal structure for a free-standing, short
oligonucleotide in the A-form. Here, two purified DNA oligonucleotides (5’-G5-MeSedUGT-
Z-ACAC-3’ and 5’-G5-MeSedUGT-P-ACAC-3’, 1 mM, 30 µL each), each carrying a
thymidine whose methyl group was replaced by a Me-Se unit, were heated to 80 °C for 2
min, and cooled slowly to room temperature. The Hampton Research Nucleic Acid Mini
Screen Kit was applied to screen crystallization conditions at different temperatures (5,
20 and 25 °C) by hanging-drop vapor diffusion. The DNA crystallized in several weeks
in 12 out of 24 buffer conditions. MPD (35%) was used as a cryoprotectant during
crystal mounting, and data were collected under a liquid nitrogen stream at
-174 °C at beamline 9-2 of the Stanford Synchrotron Radiation Lightsource (SSRL).
The crystal from buffer [10% (v/v) MPD, 40 mM sodium
cacodylate, 12 mM spermine tetra-HCl (pH 6.0), 80 mM NaCl, 20 mM MgCl2] was used
for data collection. The distance between the detector and the crystal was set to 200 mm.
The crystal was exposed for 5 s/image with 1° oscillation, and a total of 180 images were
taken. All the data were processed using HKL2000. The structure of modified DNA was
solved by molecular replacement, with a native DNA duplex
(5’-GTGTCACAC-3’/5’-GTGTGACAC-3’) as a model.
The refinement protocol included simulated annealing, positional refinement, restrained
B-factor refinement, and bulk solvent correction. The topologies and parameters for
6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone (Z) and
2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (P) were
constructed and applied. After several cycles of refinement, a number of highly ordered
water molecules and metal ions were added. Data collection, phasing, and refinement
statistics of the determined structure are listed in Supplementary Tables 1 and 2.
Synthesis and purification of GACTZP libraries containing four natural nucleotides
(G, A, C, and T) and AEGIS nucleotides (Z and P) to support AEGIS-LIVE. All dZ
and dP containing oligonucleotides (Supplementary Table 3) were synthesized using
standard phosphoramidite chemistry on glass supports (CPG) on an ABI 394 DNA
Synthesizer. Protected dZ and dP phosphoramidites were purchased from Firebird
Biomolecular Sciences LLC (Alachua FL, www.firebirdbio.com). Standard
phosphoramidites (Bz-dA, Ac-dC, dmf-dG, and dT) were from Glen Research (Sterling
VA). The oligonucleotides were designed to have forward and reverse primer binding
segments (each 18 nucleotides in length) with a random region (25 nts) containing
GACTZP (six nucleotides) at each site in equimolar concentrations. Coupling times were
60 seconds.
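As a rough illustration of the library design described above, the sketch below estimates how much of the six-letter sequence space a synthesized pool can sample. It uses only figures stated in this document (two 18-nt primer-binding segments, a 25-nt random region with six equimolar nucleotides, and the 20 nmol library sample used for selection) plus Avogadro's number. The function names are illustrative assumptions, not part of any published software.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def sequence_space(alphabet_size: int, random_length: int) -> int:
    """Number of distinct sequences in a fully randomized region."""
    return alphabet_size ** random_length

def molecules(nmol: float) -> float:
    """Number of molecules in a sample given in nanomoles."""
    return nmol * 1e-9 * AVOGADRO

# Library layout from the text: primer site + random region + primer site
full_length = 18 + 25 + 18

six_letter = sequence_space(6, 25)   # GACTZP library
four_letter = sequence_space(4, 25)  # standard GACT library, for comparison
pool = molecules(20)                 # 20 nmol library sample
coverage = pool / six_letter         # upper bound on the fraction sampled

print(f"full-length oligonucleotide: {full_length} nt")
print(f"six-letter sequence space:   {six_letter:.2e}")
print(f"four-letter sequence space:  {four_letter:.2e}")
print(f"molecules in 20 nmol:        {pool:.2e}")
print(f"fraction of space sampled:   {coverage:.1e} (at most)")
```

Under these assumptions the six-letter space (about 2.8 × 10^19 sequences) exceeds the number of molecules in 20 nmol (about 1.2 × 10^16), so the pool can sample only a small fraction of the possible GACTZP sequences.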
The CPG-bound DMT-off DNA molecules were incubated with acetonitrile-triethylamine
(1:1 v/v, 1.5 mL) for 1 hour at 25 °C. After removal of the supernatant,
the CPG-bound oligonucleotides were treated with another 1.5 mL of triethylamine-acetonitrile
(1:1 v/v) overnight at 25 °C. After removal of the supernatant, the CPG-bound
oligonucleotides were incubated with 1.0 mL of DBU in anhydrous CH3CN (1 M)
at room temperature for ~18 hours to remove the protecting groups on dZ. After removal
of the CH3CN, the dZ- and dP-containing oligonucleotides were then treated with NH4OH (55 °C,
overnight). The product mixture was resolved by denaturing PAGE (7 M urea), and the
product was extracted with TEAA buffer (0.2 M, pH 7.0) and then desalted on Sep-Pak
Plus C18 cartridges (Waters). All 5’-biotinylated dZ- and dP-containing potential
aptamers were synthesized, deprotected, and purified in house based on the above
methods. All standard 5’-biotinylated oligonucleotides were purchased from IDT and
purified by HPLC.
Cell lines. Liver cancer cell line HepG2 (ATCC HB-8065) was purchased from the
American Type Culture Collection (ATCC). The negative cell line Hu1545V was established
by immortalizing primary hepatocytes (normal liver cells) with a lentivirus carrying hTERT
(human telomerase reverse transcriptase, the enzyme that maintains telomere length at the
ends of chromosomes and thus enables cells to grow and proliferate). Although it has an
extended lifespan, this cell line retains the characteristics, most importantly the protein
profile, of normal liver cells rather than those of liver cancer cells. Both the liver cancer cells
(HepG2, ATCC® HB-8065™) and the normal liver cells (Hu1545V) were maintained in high
glucose DMEM culture medium (Sigma-Aldrich, St. Louis, MO) supplemented with 10%
fetal bovine serum (Gibco®, Life Technologies, Carlsbad, CA) and 1% penicillin-streptomycin
(Life Technologies, Carlsbad, CA). Cultures were incubated at 37 °C with 5% CO2.
Because the cells are adherent, 95-100% confluent cell culture dishes were used
throughout the entire selection process.
Experimental procedure of AEGIS Cell-LIVE. To begin the AEGIS-LIVE experiment,
HepG2 cells were seeded in 10 cm culture dishes. These cells adhere to the bottom of
dishes and grow to about 97% coverage. Cells were washed with washing buffer (4.5
g/liter glucose, 5 mM MgCl2 in Dulbecco’s PBS). A sample of the DNA library (20 nmol)
was dissolved in binding buffer (700 µL, 4.5 g/liter glucose, 5 mM MgCl2, 0.1 mg/mL
tRNA and 1 mg/mL BSA, all in Dulbecco’s PBS). The GACTZP DNA library was
denatured by heating (85 °C, 5 min), and then “snap cooled” on ice for 10 min. The
library was then incubated with the cells still adhering to the bottom of the culture dish
(4 °C with rocking for 30-60 min). Cells were gently washed three times with washing buffer
to remove unbound sequences. Binding buffer (0.5 mL) was then added and the cells were
scraped off the plate using a cell scraper to recover cell/DNA complexes.
Once the cells were scraped from the bottom of the dish into a suspension in PBS
buffer, they were heated (85 °C for 10 min). The resulting mixture was centrifuged at 14000
rpm to pellet the cell debris. The supernatant containing the ssDNA survivors was
further incubated with counter cells attached to the dish bottom at 4 °C (with rocking, 1 h).
The survivors were collected in the supernatant, and then amplified by six-nucleotide
PCR using fluorescein- and biotin-labeled primers (Supplementary Table 3) with all six
nucleoside triphosphates (dZTP, dPTP, dGTP, dATP, dCTP, and dTTP).
Different numbers of PCR cycles (from 8 to 25) were tested to determine the
optimum number of cycles for preparative PCR, producing the maximal amount of amplicon
with the least Z/P loss. Typical six-nucleotide PCR reagents and conditions are listed in
Supplementary Table 4. Upon the completion of six-nucleotide PCR, the FITC-labeled
DNA strands were separated from the biotinylated strands by affinity purification with
streptavidin-coated Sepharose beads (GE Healthcare Bio-Sciences Corp., Piscataway),
followed by alkaline denaturation (with NaOH, 100 mM). The surviving ssDNA was
desalted and resuspended in binding buffer to a final concentration of 0.5 µM. The
survivors were denatured at 85 °C, snap cooled, and used to perform the second round of
selection using the same procedure as described for the first round of selection.
Negative selection was included only in Rounds 3-6. The entire selection process was
repeated until a sustained significant enrichment was obtained at the 13th round. During
the selection, the stringency was increased by decreasing the number of cells
and the incubation times (Supplementary Table 5).
Deep sequencing of GACTZP DNA survivors using Next Generation sequencing
technology. Solutions containing enriched GACTZP DNA survivors after the 13th round
of AEGIS Cell-LIVE were divided into two equal parts. These were separately converted
into standard DNA under two conversion conditions using primers that carried barcodes
for the Ion Torrent deep sequencing (Supplementary Table 6). Following conversion,
the samples were combined, purified by native agarose gel, and submitted for Ion Torrent
“next generation” sequencing at the University of Florida, ICBR sequencing core facility.
The products were aligned to identify sequences derived from a single common aptamer
“ancestor”, and the ancestral sequence was inferred (see below).
Inference of GACTZP aptamer sequences. Ion Torrent sequencing reads that did not
contain exact matches to the barcode and to the forward and reverse priming sequences were
discarded. To minimize miscalling, any read present in fewer than 45 copies was removed
from the analysis. The remaining reads (1,877,526 out of 2,226,873) were then clustered
using software custom designed at the FfAME, which ignored differing barcodes during
the clustering and accepted single-step changes within sequence reads. Clustered
sequences were then separated by barcode, and variable sites were compared between
the barcodes (which differentiate the two conversion conditions). The clustered sequences
obtained under the first conversion conditions (Barcode A, Z to C and P to G conversion)
served as the reference for the clustered sequences obtained under the second conversion
conditions (Barcode B, Z to T/C and P to A/G conversion). Sites where C and T were
found in approximately equal amounts after conversion under the second conditions were
assigned as Z in their “parent”. Sites where G and A were found in approximately equal
amounts after conversion under the second conditions were assigned as P in their “parent”
(Supplementary Table 7).
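The assignment rule described above can be sketched as follows. This is a minimal illustration and not the custom FfAME software. The function name `assign_aegis` and the two cut-offs (`min_minor`, `min_ref`) are assumptions introduced here to stand in for "approximately equal amounts". The example counts are copied from Supplementary Table 7.

```python
def assign_aegis(counts_b, counts_a, min_minor=0.25, min_ref=0.90):
    """Classify one site as 'Z', 'P', or None (standard nucleotide).

    counts_b: base counts under Barcode B (Z to T/C, P to A/G conversion)
    counts_a: base counts under Barcode A (Z to C, P to G conversion)
    min_minor, min_ref: hypothetical cut-offs, not values from the study
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return None
    frac_a = {base: n / total_a for base, n in counts_a.items()}
    frac_b = {base: n / total_b for base, n in counts_b.items()}

    # Z: reference condition reads C; second condition reads a C/T mixture
    if (frac_a.get("C", 0.0) >= min_ref
            and min(frac_b.get("C", 0.0), frac_b.get("T", 0.0)) >= min_minor):
        return "Z"
    # P: reference condition reads G; second condition reads a G/A mixture
    if (frac_a.get("G", 0.0) >= min_ref
            and min(frac_b.get("G", 0.0), frac_b.get("A", 0.0)) >= min_minor):
        return "P"
    return None

# LZH9, position 23 (Supplementary Table 7)
lzh9_23 = assign_aegis({"A": 108316, "G": 102944, "T": 7378},
                       {"G": 259638, "A": 3612})
# LZH7, position 32
lzh7_32 = assign_aegis({"C": 103017, "T": 81831, "A": 4555},
                       {"C": 238740, "T": 4114, "A": 287})
# LZH9, position 26: mixed T/C under BOTH conditions, so not an AEGIS site
lzh9_26 = assign_aegis({"T": 157872, "C": 59806, "G": 813, "A": 147},
                       {"T": 186276, "C": 75258, "G": 1716})
print(lzh9_23, lzh7_32, lzh9_26)
```

Requiring a clean C or G call under Barcode A is what separates a true Z or P site from an ordinary polymorphic site, which shows the same mixture under both conversion conditions.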
Screening of potential aptamer candidates. Analysis of the Ion Torrent sequencing
output identified with decreasing abundance several different aptamer species holding Z
and/or P, as well as species containing only standard bases. Each sequence was
chemically synthesized, labeled with biotin at the 5’ end, and then purified by HPLC (for
IDT-derived oligos) or PAGE (for oligos prepared in house). These were quantified (UV
260/280) and diluted to standard concentrations. Flow cytometry binding assays were
then done using the target HepG2 cells. To obtain suspended cells for flow cytometry,
culture medium was removed from the cells and non-enzymatic dissociation buffer was
added to cover the surface of the entire flask. The flask was placed in an incubator at 37 °C.
After incubation (5 min), the cells were aspirated using a transfer pipette to remove them
from the flask. The cells were washed twice by centrifugation, and approximately 5.0 × 10^5 cells
were incubated separately with each aptamer candidate at a final concentration of 250 nM.
After incubation, cells were washed. Streptavidin-PE conjugate (100 µL of 1:400 dilution,
optimized) was then added, and the mixture was incubated at 4 °C for 10 min. Excess dye
conjugate was removed by washing twice, and the cell-DNA complexes were resuspended in
150 µL binding buffer. The aptamer binding signal was detected using flow cytometry
(Accuri™ C6, BD). Unselected library was used as a control to set the fluorescence
background.
Determination of binding affinity. The binding affinity of the most abundant aptamers
was measured by flow cytometry using biotin-labeled aptamer, and the signal was
detected with streptavidin-PE conjugate. HepG2 cells were dissociated using non-enzymatic
dissociation buffer. Cells were washed and incubated with varying
concentrations (0.1 nM - 1000 nM final concentration) of biotin-labeled aptamer in a 200
µL volume of binding buffer. After 20 min of incubation, cells were washed twice with
washing buffer and then incubated with conjugate dye (100 µL, 1:400 dilution). The cells were
incubated (10 min) and then washed twice (1300 µL each) with washing buffer. The cell
pellets were resuspended in washing buffer (200 µL) and analyzed by flow cytometry.
The biotin-labeled unselected library was used as a negative control to determine the
background binding. All binding assays were done in triplicate. The mean fluorescence
intensity of the unselected library was subtracted from that of the corresponding aptamer
with the target cells to determine the specific binding of the labeled aptamer.
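The background subtraction described above, followed by a dissociation-constant estimate, might look like the sketch below. The text does not state which binding model was fitted, so the one-site saturation model Y = Bmax·X/(Kd + X) used here is an assumption, as are the function names and the grid-search fitting strategy. The example data are synthetic and are not measurements from this study.

```python
def specific_binding(aptamer_mfi, library_mfi):
    """Subtract the unselected-library background from each aptamer reading."""
    return [a - b for a, b in zip(aptamer_mfi, library_mfi)]

def fit_one_site(conc_nm, signal, kd_min=0.1, kd_max=1000.0, steps=2000):
    """Least-squares fit of signal = bmax * x / (kd + x) (assumed model).

    Kd is scanned on a log-spaced grid. For each trial Kd the best Bmax
    has a closed form because the model is linear in Bmax.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        occupancy = [x / (kd + x) for x in conc_nm]
        bmax = (sum(y * f for y, f in zip(signal, occupancy))
                / sum(f * f for f in occupancy))
        sse = sum((y - bmax * f) ** 2 for y, f in zip(signal, occupancy))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    return best[1], best[2]

# Synthetic example: true Kd = 24 nM, Bmax = 1000, constant background = 50
conc = [0.1, 1, 5, 10, 25, 50, 100, 250, 500, 1000]  # nM, within the 0.1-1000 nM range
library = [50.0] * len(conc)
aptamer = [50.0 + 1000.0 * x / (24.0 + x) for x in conc]

kd, bmax = fit_one_site(conc, specific_binding(aptamer, library))
print(f"Kd ~ {kd:.1f} nM, Bmax ~ {bmax:.0f}")
```

A grid search is used only to keep the sketch free of external dependencies; a nonlinear least-squares routine would serve the same purpose.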
Figure S1 Crystal structures of Z:P pair in a duplex and comparison of some
proposed expanded genetic alphabets. (a and b) C:G and T:A pairs: (top) molecular structure, (bottom) space filling (from this work). (c) Z:P pair retaining hydrogen bonding: (top) molecular structure as designed, (bottom) space filling (from this work). (d) F:Z from Kool (3): (top) molecular structure as designed, (bottom) space filling (from an NMR structure in solution [PDB ID: 1EEK]). (e) The 5SICS:NaM structure from Romesberg (4): (top) molecular structure as designed, (bottom) space filling showing intercalation (from crystal structure [PDB ID: 4C8L]). (f) The Px:Ds from Hirao (5, 6): (top) molecular structure as designed, (bottom) modeling simulation by ChemDraw, as no experimental structure is known. DNA sequences: 5’-G5-MeSedUGT-Z-ACAC-3’ and 5’-G5-MeSedUGT-P-ACAC-3’ (resolution: 1.7 Å; PDB ID: 4RHD). The unit cell contains four duplexes. (g) One of the four duplexes in the unit cell. (h) Pairs of near-identical duplexes in the unit cell shown superimposed on all 368 atoms, with their backbones abstracted to ribbons; Duplex 1 (Chains A-B; green) and Duplex 3 (Chains E-F; gray) have very similar structures, with an rms heavy-atom deviation of 0.54 Å; Duplex 2 (Chains C-D; gray) and Duplex 4 (Chains G-H; gray) were also similar, with a deviation of 0.72 Å. (i) Z:P base pair with Fo-Fc density map. The average hydrogen bond lengths are 3.0, 3.0, and 2.9 Å (top, middle, bottom hydrogen bonds, from the major to the minor groove). (j) Stacking of the Z:P pair with adjacent nucleobase pairs.
Figure S2 Monitoring the progress of GACTZP AEGIS Cell-LIVE using flow
cytometry. The binding affinity of survivors was monitored in bulk from the 8th round to
the 13th round of selection. The vertical axis (Events) indicates the number of cells counted
having the fluorescence intensity indicated by the horizontal axis. A higher intensity
indicates a larger number of fluorescein-labeled aptamers bound per cell. Binding assays
showed a noticeable increase in bulk binding of the pool to liver cancer cells (HepG2)
after the 13th round of selection, but not to untransformed liver cells (Hu1545V, the “counter
cell”). This is attributed to a substantial enrichment in the pool of individual aptamer
species that have affinity for the transformed cells (HepG2) but not for the untransformed
cells (Hu1545V).
Figure S3 Binding curves of selected aptamers. Conditions were the same as those given in the
Methods section.
Table S1 Diffraction Data collection and Statistics
Wavelength, Å 0.9795
Resolution range, Å 50.00-1.70
(last shell) (1.76-1.70)
Unique reflections 37630 (3754)
Completeness, % 99.5 (100)
Rmerge, % 12.7 (43.4)
<I/σ(I)> 11.1 (5.0)
Redundancy 3.9 (4.0)
Table S2 Structure Refinement and Model Statistics.
Complex Name 5’-GTGTZACAC-3’/ 3’-CACAPTGTG-5’
PDB code 4RHD
Space group P3
Number of molecules per unit cell 4
a = b (Å) 59.429
c (Å) 87.048
α=β (°) 90
γ (°) 120
Unique reflections 37630
Rmerge (%) overall (final shell) 12.7 (43.4)
<I/sigma> overall (final shell) 11.1 (5.0)
Resolution range (Å) 50.00 - 1.70
R work 0.20
R free 0.21
R factor 0.20
Completeness overall (final shell) 99.5% (100%)
RMS deviation from ideality
  Bond lengths (Å) 0.022
  Angle distances (degree) 3.20
Average B-factors (Å²) 31.29
Number of atoms
  Nucleic acid 1457
  Water 70
  Ion 3
Wilson Plot Null
Overall Anisotropic B-values
B11/B22/B33 0.00/0.00/0.00
Table S3 GACTZP DNA library, 6-nucleotide PCR primers, and barcoded primers
for deep sequencing
Name Sequence
Initial GACTZP DNA Library
5’-ATCCAGAGTGACGCAGCA-(N)25-TGGACACGGTGGCTTAGT-3’
N = A, G, C, T, Z, and P nucleotides mixture
Forward Primer 5’-FITC-ATC CAG AGT GAC GCA GCA-3’
Reverse Primer 5’-biotin-ACT AAG CCA CCG TGT CCA-3’
A_CodeA_Forward_58mer
Adaptor A Key Barcode A Forward Primer
5’-CCATCTCATCCCTGCGTGTCTCCGACTCAG-GATGATTGCC-ATCCAGAGTGACGCAGCA-3’
A_CodeB_Forward_58mer
Adaptor A Key Barcode B Forward Primer
5’-CCATCTCATCCCTGCGTGTCTCCGACTCAG-CTTACACCAC-ATCCAGAGTGACGCAGCA-3’
trP_Reverse_41mer
Adaptor trP1 Reverse Primer
5’-CCTCTCTATGGGCAGTCGGTGAT-ACCTGTGCCACCGAATCA-3’
Table S4 Typical six-nucleotide PCR amplification of GACTZP DNA library
Reagents                                            Volume (µL)   Final Concentration
ddH2O                                               30.5
Forward and reverse primers mixture (each 10 µM)    2.5           0.5 µM
10x Six-Nucleotide Mix:                             5.0
  dA,T,G/TPs (1 mM of each)                                       0.1 mM of each
  dCTP (2 mM)                                                     0.2 mM
  dZTP (1 mM)                                                     0.1 mM
  dPTP (6 mM)                                                     0.6 mM
10x TaKaRa PCR Buffer (pH = 8.3)                    5.0           1x
GACTZP DNA library (survivors)                      5.0           (10% of reaction volume)
Takara Taq HS DNA polymerase (5 units/µL)           2             0.10 U/µL
Total volume (µL)                                   50.0
Note: 1x TaKaRa PCR Buffer (10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2, pH 8.3 at
25 °C); PCR cycling conditions: one cycle of 90 °C for 1.5 min; 8 to 25 cycles of
(90 °C for 30 s, 55.5 °C for 30 s, 72 °C for 5 min); 72 °C for 10 min; 4 °C for extended
times.
Table S5 Selection Procedure
Round   Cells involved                     Culture dish size (diameter)                  Incubation time
1-2     Positive selection only            100 mm                                        1 h
3-4     Positive and negative selection    100 mm for both                               1 h for each
5-6     Positive and negative selection    60 mm for positive and 100 mm for negative    45 min for positive and 1 h for negative
7-13    Positive selection only            60 mm                                         30 min
Table S6 Converting Z:P to C:G (barcode A) or converting Z:P to T:A and C:G
(barcode B)
Components                             Z:P to C:G    Z:P to T:A and C:G    Final Concentration
                                       conversion    conversion
ddH2O                                  33 µl         33 µl                 Total volume: 50 µl
A_CodeA_For_56mer (10 µM)              2 µl                                0.4 µM
trP_Rev_39mer (10 µM)                  2 µl                                0.4 µM
A_CodeB_For_56mer (10 µM)                            2 µl                  0.4 µM
trP_Rev_39mer (10 µM)                                2 µl                  0.4 µM
13th-Round Survivors                   1 µl          1 µl
10x Five-Nucleotide Mix:               5 µl
  dZTP (0.1 mM)                                                            0.01 mM
  dC,G/TPs (4 mM of each)                                                  0.4 mM of each
  dT,A/TPs (0.4 mM)                                                        0.04 mM of each
10x Five-Nucleotide Mix:                             5 µl
  dPTP (2 mM)                                                              0.2 mM
  dC,G/TPs (1 mM of each)                                                  0.1 mM of each
  dT,A/TPs (1 mM)                                                          0.1 mM of each
10x ThermoPol Buffer (pH 8.8)          5 µl          5 µl                  1x
JumpStart Taq (2.5 units/µl, Sigma)    2 µl          2 µl                  0.1 U/µl
Note: 1) 1x ThermoPol Reaction Buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM
KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8 at 25 °C); 2) PCR conditions: one cycle
of 94 °C for 1 min; 12 cycles of (94 °C for 20 s, 57 °C for 30 s, 72 °C for 90 s); 72 °C for
10 min; 4 °C for extended times.
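The final concentrations in Table S6 follow from simple dilution of each stock into the 50 µl reaction. The sketch below reproduces a few of them as a consistency check. The helper name `final_conc` and the dictionary layout are illustrative assumptions; the volumes and stock concentrations are those listed in the table for a single conversion reaction.

```python
def final_conc(stock_conc, volume_added_ul, total_volume_ul=50.0):
    """Concentration after dilution into the reaction (same unit as stock_conc)."""
    return stock_conc * volume_added_ul / total_volume_ul

# Volumes (µl) for one conversion reaction, as listed in Table S6
volumes_ul = {
    "ddH2O": 33,
    "forward primer (10 µM)": 2,
    "reverse primer (10 µM)": 2,
    "13th-round survivors": 1,
    "10x five-nucleotide mix": 5,
    "10x ThermoPol buffer": 5,
    "JumpStart Taq (2.5 units/µl)": 2,
}
total_ul = sum(volumes_ul.values())

primer_um = final_conc(10.0, 2)      # µM
taq_u_per_ul = final_conc(2.5, 2)    # units/µl
dptp_mm = final_conc(2.0, 5)         # mM, dPTP stock in the 10x mix
dztp_mm = final_conc(0.1, 5)         # mM, dZTP stock in the 10x mix

print(f"total volume: {total_ul} µl")
print(f"primer: {primer_um:.2f} µM, Taq: {taq_u_per_ul:.2f} U/µl")
print(f"dPTP: {dptp_mm:.2f} mM, dZTP: {dztp_mm:.3f} mM")
```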
Table S7 Examples of conversion sequencing data
LZH9: 481888 Sequences, 25.67%
Barcode: CTTACACCAC, Z to T/C conversion; P to A/G conversion. Barcode: GATGATTGCC, Z to C and P to G conversion.
Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Z or P Conversion
19 T(217368) 99.4 C(1270) 0.6 19 T(257352) 97.8 C(5898) 2.2
20 A(218148) 99.8 G(490) 0.2 20 A(261693) 99.4 G(1557) 0.6
21 T(207942) 95.1 C(10696) 4.9 21 T(234948) 89.2 C(28302) 10.8
22 C(215302) 98.5 T(3336) 1.5 22 C(258399) 98.2 T(4851) 1.8
23 A(108316) 49.5 G(102944) 47.1 T(7378) 3.4 23 G(259638) 98.6 A(3612) 1.4 P Conversion
24 G(218638) 100 24 G(263250) 100
25 T(204939) 93.7 A(7523) 3.4 C(6176) 2.8 25 T(238550) 90.6 C(15094) 5.7 A(9606) 3.6
26 T(157872) 72.2 C(59806) 27.4 G(813) 0.4 A(147) 0.1 26 T(186276) 70.8 C(75258) 28.6 G(1716) 0.7
27 G(217258) 99.4 A(1380) 0.6 27 G(263250) 100
28 C(218432) 99.9 T(206) 0.1 28 C(263250) 100
29 C(218179) 99.8 T(459) 0.2 29 C(263250) 100
30 C(206517) 94.5 T(11985) 5.5 A(136) 0.1 30 C(253821) 96.4 T(9429) 3.6
31 T(212343) 97.1 C(6226) 2.8 A(69) 0 31 T(252696) 96 C(10554) 4
32 T(216055) 98.8 C(2583) 1.2 32 T(258481) 98.2 C(4769) 1.8
33 A(216841) 99.2 G(1797) 0.8 33 A(258982) 98.4 G(4268) 1.6
34 A(213486) 97.6 G(4869) 2.2 T(283) 0.1 34 A(251010) 95.4 G(12240) 4.6
35 A(200064) 91.5 G(18416) 8.4 T(158) 0.1 35 A(230072) 87.4 G(32799) 12.5 C(379) 0.1
36 G(213967) 97.9 A(4520) 2.1 T(151) 0.1 36 G(263250) 100
37 G(218436) 99.9 A(202) 0.1 37 G(263250) 100
38 C(218325) 99.9 T(313) 0.1 38 C(263250) 100
39 T(177016) 81 C(41622) 19 39 T(209299) 79.5 C(53951) 20.5
40 A(217191) 99.3 G(1447) 0.7 40 A(257480) 97.8 G(5770) 2.2
41 T(217956) 99.7 C(682) 0.3 41 T(260947) 99.1 C(2303) 0.9
42 G(215988) 98.8 A(2650) 1.2 42 G(263194) 100 A(56) 0
43 G(215891) 98.7 A(2747) 1.3 43 G(262761) 99.8 A(489) 0.2
LZH7: 432544 Sequences, 23.04%
Barcode: CTTACACCAC, Z to T/C conversion; P to A/G conversion. Barcode: GATGATTGCC, Z to C and P to G conversion.
Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Z or P Conversion
19 C(188949) 99.8 T(454) 0.2 19 C(242980) 99.9 T(161) 0.1
20 A(186938) 98.7 G(2465) 1.3 20 A(237130) 97.5 G(6011) 2.5
21 A(188816) 99.7 G(587) 0.3 21 A(241309) 99.2 G(1832) 0.8
22 T(188662) 99.6 C(741) 0.4 22 T(240847) 99.1 C(2294) 0.9
23 A(188781) 99.7 G(622) 0.3 23 A(241365) 99.3 G(1776) 0.7
24 A(188587) 99.6 G(816) 0.4 24 A(240603) 99 G(2538) 1
25 T(187208) 98.8 C(2195) 1.2 25 T(236650) 97.3 C(6491) 2.7
26 T(183594) 96.9 C(5809) 3.1 26 T(227366) 93.5 C(15775) 6.5
27 C(189184) 99.9 T(219) 0.1 27 C(243061) 100 T(80) 0
28 T(187004) 98.7 C(2399) 1.3 28 T(236971) 97.5 C(6114) 2.5 G(56) 0
29 G(110955) 58.6 A(77394) 40.9 T(1054) 0.6 29 G(241948) 99.5 A(1193) 0.5 P Conversion
30 G(188982) 99.8 A(421) 0.2 30 G(243001) 99.9 A(140) 0.1
31 C(189125) 99.9 T(278) 0.1 31 C(243087) 100 T(54) 0
32 C(103017) 54.4 T(81831) 43.2 A(4555) 2.4 32 C(238740) 98.2 T(4114) 1.7 A(287) 0.1 Z Conversion
33 G(189211) 99.9 A(192) 0.1 33 G(243141) 100
34 C(188737) 99.6 T(666) 0.4 34 C(242958) 99.9 T(183) 0.1
35 G(187228) 98.9 A(2122) 1.1 T(53) 0 35 G(242949) 99.9 A(118) 0 C(74) 0
36 G(157778) 83.3 A(31625) 16.7 36 G(209921) 86.3 A(33220) 13.7
37 T(188174) 99.4 C(1229) 0.6 37 T(237857) 97.8 C(5284) 2.2
38 A(188299) 99.4 G(1104) 0.6 38 A(237997) 97.9 G(5144) 2.1
39 T(187201) 98.8 C(2202) 1.2 39 T(237596) 97.7 C(5545) 2.3
40 T(188208) 99.4 C(1195) 0.6 40 T(239318) 98.4 C(3612) 1.5 G(153) 0.1 A(58) 0
41 G(180107) 95.1 A(9015) 4.8 T(281) 0.1 41 G(237715) 97.8 A(5426) 2.2
42 G(168162) 88.8 A(20920) 11 T(321) 0.2 42 G(221312) 91 A(21691) 8.9 T(81) 0 C(57) 0
43 G(186394) 98.4 A(2811) 1.5 C(198) 0.1 43 G(242236) 99.6 A(588) 0.2 C(317) 0.1
LZH3: 147144 Sequences, 7.84%
Barcode: CTTACACCAC, Z to T/C conversion; P to A/G conversion. Barcode: GATGATTGCC, Z to C and P to G conversion.
Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Position Main Base Main % 2nd Base 2nd % 3rd Base 3rd % 4th Base 4th % Z or P Conversion
19 C(63606) 99.7 T(223) 0.3 19 C(83133) 99.8 T(115) 0.1 G(67) 0.1
20 G(63407) 99.3 A(422) 0.7 20 G(83315) 100
21 A(63355) 99.3 G(474) 0.7 21 A(81896) 98.3 G(1419) 1.7
22 C(63053) 98.8 T(776) 1.2 22 C(82423) 98.9 T(892) 1.1
23 C(63755) 99.9 T(74) 0.1 23 C(83315) 100
24 C(38982) 61.1 T(23973) 37.6 A(874) 1.4 24 C(82553) 99.1 T(653) 0.8 A(109) 0.1 Z Conversion
25 G(63829) 100 25 G(83315) 100
26 A(63547) 99.6 G(282) 0.4 26 A(82050) 98.5 G(1265) 1.5
27 C(63829) 100 27 C(83315) 100
28 T(62230) 97.5 C(1599) 2.5 28 T(79074) 94.9 C(4241) 5.1
29 T(61883) 97 C(1946) 3 29 T(78389) 94.1 C(4926) 5.9
30 T(50127) 78.5 C(13702) 21.5 30 T(62089) 74.5 C(21226) 25.5
31 T(62966) 98.6 C(863) 1.4 31 T(81318) 97.6 C(1997) 2.4
32 A(63046) 98.8 G(783) 1.2 32 A(81081) 97.3 G(2234) 2.7
33 G(63829) 100 33 G(83315) 100
34 C(63829) 100 34 C(83315) 100
35 A(45474) 71.2 G(18241) 28.6 T(114) 0.2 35 G(81449) 97.8 A(1866) 2.2 P Conversion
36 T(63531) 99.5 C(298) 0.5 36 T(81882) 98.3 C(1433) 1.7
37 C(63501) 99.5 T(328) 0.5 37 C(83315) 100
38 G(63770) 99.9 A(59) 0.1 38 G(83315) 100
39 A(60433) 94.7 G(3396) 5.3 39 A(76682) 92 G(6633) 8
40 A(63062) 98.8 G(767) 1.2 40 A(79725) 95.7 G(3590) 4.3
41 T(63699) 99.8 C(130) 0.2 41 T(82803) 99.4 C(512) 0.6
42 A(61297) 96 G(2532) 4 42 A(75878) 91.1 G(7437) 8.9
43 G(63206) 99 A(623) 1 43 G(83255) 99.9 A(60) 0.1
Table S8 Dissociation constants (Kd) measured on untransformed liver cells
Note: N.A.: not available.
Name Kd (nM)
LZH1 16±6
LZH4 11±7
LZH5 N.A.
LZH6 9±3
LZH7 55±23
Table S9 Binding and specificity of selected Z/P-containing aptamers
LZH1 LZH2 LZH3 LZH4 LZH5 LZH6 LZH7 LZH8 LZH9 LZH11 LZH12
HepG2 ++++ +++ ++++ ++++ ++++ ++++ ++++ ++++ ++ ++++ ++
Hu1545V + - - + + + + - - - -
HuH-7 + + + + ++ + + - + + +
HuH-7.5 - + - + - + + + + + -
LH86 + + + + - + + - - + -
THLE-2 + - - + - + + - - - -
HEK-293 - - - - - - + - - - -
HeLa - - - - - - + - - - +
MCF-7 + - + + - + + - - + +
TOV-21G - - - - - - - - - - -
PL45 - - - + ++ + + - - - -
MDA-MB-231 + - - - - - + - - - -
K-562 - - - - - - - - - - -
A549 - - - - - - - - ++ - -
H226 - - - - - - - - - - -
Ramos - - - - - - - - - - -
CCRF-CEM - - - - - - - - - - -
*A threshold based on the fluorescence intensity of FITC in the flow cytometric analysis was chosen so that 99% of cells incubated with the FITC-labeled unselected DNA library would have lower fluorescence intensity than the threshold. When the FITC-labeled aptamer was allowed to interact with the cells, the percentage of the cells with fluorescence above the set threshold was used to evaluate the binding capacity of the aptamer to the cells. 0–10% -, 10–35% +, 35–60% ++, 60–85% +++, >85% ++++.
**Target cell and liver cancer (HepG2), counter cell and normal liver cell (Hu1545V), liver cancer (HuH-7, HuH-7.5, LH86), normal liver cell (THLE-2), embryonic kidney cell (HEK-293), cervical cancer (HeLa), breast cancer (MCF-7, MDA-MB-231), ovarian cancer (TOV-21G), pancreatic cancer (PL45), lung cancer (H226, A549), leukemia (K-562, Ramos, CCRF-CEM).
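The scoring scheme in the footnote can be sketched as follows. The function names are illustrative assumptions, the treatment of values that fall exactly on a bin boundary is a choice made here (the footnote does not specify it), and the intensities in the example are synthetic.

```python
def control_threshold(control_intensities, percent=99):
    """Intensity at or below which `percent`% of control cells fall."""
    ordered = sorted(control_intensities)
    n = len(ordered)
    # Smallest rank k with k/n >= percent/100, computed with integer arithmetic
    k = (percent * n + 99) // 100
    return ordered[max(k, 1) - 1]

def percent_above(intensities, threshold):
    """Percentage of cells whose fluorescence exceeds the threshold."""
    above = sum(1 for value in intensities if value > threshold)
    return 100.0 * above / len(intensities)

def grade(pct):
    """Map percent-above-threshold to the symbols used in Table S9."""
    if pct < 10:
        return "-"
    if pct < 35:
        return "+"
    if pct < 60:
        return "++"
    if pct <= 85:
        return "+++"
    return "++++"

# Synthetic example: 100 control cells with intensities 1..100
control = list(range(1, 101))
threshold = control_threshold(control)

# Synthetic aptamer-stained sample: 70% bright cells, 30% dim cells
sample = [150] * 70 + [10] * 30
sample_pct = percent_above(sample, threshold)
print(threshold, sample_pct, grade(sample_pct))
```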
Table S10 Sequences (only showing the randomized region) and dissociation
constants (Kd) of LZH3 and LZH7 with the indicated AEGIS nucleotides (red)
replaced by standard nucleotides (blue)
Name Sequences Kd (nM)
LZH3 ~CGACCZGACTTTTAGCPTCGAATAG~ 24±5
LZH3-1 ~CGACCCGACTTTTAGCGTCGAATAG~ 119±34
LZH3-2 ~CGACCTGACTTTTAGCATCGAATAG~ 59±26
LZH3-3 ~CGACCCGACTTTTAGCPTCGAATAG~ 131±21
LZH3-4 ~CGACCTGACTTTTAGCPTCGAATAG~ 234±48
LZH3-5 ~CGACCZGACTTTTAGCGTCGAATAG~ 122±27
LZH3-6 ~CGACCZGACTTTTAGCATCGAATAG~ 85±12
LZH7 ~CAATAATTCTPGCZGCGGTATTGGG~ 55±8
LZH7-1 ~CAATAATTCTGGCCGCGGTATTGGG~ 216±29
LZH7-2 ~CAATAATTCTAGCTGCGGTATTGGG~ 186±73
LZH7-3 ~CAATAATTCTPGCCGCGGTATTGGG~ 280±74
LZH7-4 ~CAATAATTCTPGCTGCGGTATTGGG~ 358±162
LZH7-5 ~CAATAATTCTGGCZGCGGTATTGGG~ 123±48
LZH7-6 ~CAATAATTCTAGCZGCGGTATTGGG~ 270±127
References
(1) Lin, L.; Sheng, J.; Huang, Z. Chem. Soc. Rev. 2011, 40, 4591.
(2) Hassan, A. E.; Sheng, J.; Jiang, J.; Zhang, W.; Huang, Z. Org. Lett. 2009, 11, 2503.
(3) Guckian, K. M.; Krugh, T. R.; Kool, E. T. J. Am. Chem. Soc. 2000, 122, 6841.
(4) Betz, K.; Malyshev, D. A.; Lavergne, T.; Welte, W.; Diederichs, K.; Romesberg, F. E.; Marx, A. J. Am. Chem. Soc. 2013, 135, 18637.
(5) Kimoto, M.; Yamashige, R.; Matsunaga, K.; Yokoyama, S.; Hirao, I. Nat. Biotechnol. 2013, 31, 453.
(6) Yamashige, R.; Kimoto, M.; Takezawa, Y.; Sato, A.; Mitsui, T.; Yokoyama, S.; Hirao, I. Nucleic Acids Res. 2012, 40, 2793.