Post on 18-Dec-2015
DNA Computation: The Secret of Life as Non-Living Technology
Russell Deaton
Professor, Computer Science & Engineering
The University of Arkansas
Fayetteville, AR 72701
rdeaton@uark.edu
Physical Structure of DNA
[Figure: DNA double helix. Labels: nitrogenous base, major groove, minor groove, central axis, sugar-phosphate backbone; dimensions 34 Å and 20 Å; strand ends marked 5' and 3' (OH), with the two strands running in opposite directions.]
DNA in the Cell
- Stored in Number of Chromosomes (24 in Human Genome)
- Tightly coiled threads of DNA and Associated Proteins
- 3 billion bp in Human Genome: Total genetic content of cell
Genome Sizes
Organism            Genome Size (Bases)   Genes
Human               3 billion             30,000
Laboratory mouse    2.6 billion           30,000
Mustard weed        100 million           25,000
Roundworm           97 million            19,000
Fruit fly           137 million           13,000
Yeast               12.1 million          6,000
Bacterium           4.6 million           3,200
HIV                 9700                  9
DNA for Non-Biological Purposes
- Encode Abiotic Information in DNA Sequences
  - Graphs and Structure
  - Computational Problems
  - Database and Memory
- Search and/or Assemble Through DNA-to-DNA Template-matching reactions
- Enzymatic Operations for Further Information Processing
Template Matching Hybridization Reaction
[Figure: two complementary single strands pairing base by base]
A-C-A-A-C-G
T-G-T-T-G-C
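The pairing on this slide can be checked mechanically. The following is a minimal sketch, assuming an idealized model in which two strands hybridize only when every base pairs perfectly (A with T, C with G); real hybridization is governed by thermodynamics and tolerates some mismatches. The function names are illustrative, not taken from the slides.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order as the input."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction, as the partner strand would be read."""
    return complement(strand)[::-1]

def hybridizes(top: str, bottom: str) -> bool:
    """True if the strands, aligned as drawn (top over bottom), pair at every position."""
    return len(top) == len(bottom) and complement(top) == bottom

top = "ACAACG"      # upper strand from the slide
bottom = "TGTTGC"   # lower strand from the slide
print(hybridizes(top, bottom))
print(reverse_complement(top))
```

The slide draws the lower strand directly beneath the upper one, so `hybridizes` compares position by position; `reverse_complement` gives the same partner strand read from its opposite end.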
Hybridization Allows:
- Massively Parallel Search based on Watson-Crick Complements
- Directed Self-Assembly of Nanostructures
- Search Stored Information for Similar Sequence Content
Differences to Biology
- No Proof-reading enzymes
- Cell versus the Test Tube
- Technology Driven by Template-matching reactions between relatively short oligonucleotides
- Biology: DNA primarily duplex
- Abiotic: DNA primarily single-stranded
Algorithm
- Generate Random Paths through the graph.
- Keep only those paths that begin with v_in and end with v_out.
- If the graph has n vertices, then keep only those paths that enter exactly n vertices.
- Keep only those paths that enter all the vertices at least once.
- If any paths remain, say “Yes”; otherwise, say “No”.
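The five steps above can be sketched in software as a generate-and-filter search. This is a minimal simulation, not the laboratory procedure: random walks stand in for random strand assembly, and set comprehensions stand in for the separation steps. The edge list is assembled from the edge labels that appear on the example path slides (E0->1, E0->6, and so on); the full graph is not given in the source, so treat it as an assumption. The function names, the 0.15 stopping probability, and the trial count are illustrative choices.

```python
import random

def random_path(vertices, edges, max_len, rng):
    """Random walk along directed edges; a stand-in for random strand assembly."""
    path = [rng.choice(vertices)]
    while len(path) < max_len:
        successors = [v for (u, v) in edges if u == path[-1]]
        # Stop at a dead end, or at random so that short paths also occur.
        if not successors or rng.random() < 0.15:
            break
        path.append(rng.choice(successors))
    return tuple(path)

def has_hamiltonian_path(vertices, edges, v_in, v_out, trials=20000, seed=0):
    rng = random.Random(seed)
    n = len(vertices)
    # Step 1: generate random paths through the graph.
    paths = {random_path(vertices, edges, 2 * n, rng) for _ in range(trials)}
    # Step 2: keep paths that begin with v_in and end with v_out.
    paths = {p for p in paths if p[0] == v_in and p[-1] == v_out}
    # Step 3: keep paths that enter exactly n vertices.
    paths = {p for p in paths if len(p) == n}
    # Step 4: keep paths that enter every vertex at least once.
    paths = {p for p in paths if set(p) == set(vertices)}
    # Step 5: "Yes" if anything remains, otherwise "No".
    return bool(paths), paths

VERTICES = list(range(7))
# Assumed edge list, taken from the edge labels on the example slides.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6),
         (0, 6), (0, 3), (3, 2), (5, 1)]

found, survivors = has_hamiltonian_path(VERTICES, EDGES, v_in=0, v_out=6)
print("Yes" if found else "No", survivors)
```

Because step 1 is randomized, a "Yes" is always backed by an actual path, while a "No" only means that no qualifying path was sampled in the given number of trials.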
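Representing a Graph with Sequences
[Figure: vertices and edges of a graph encoded as DNA strands]
Vertex 0: GCATGGCC
Vertex 1: AGCTTAGG
Vertex 2: ATGGCATG
Edge 0->1: CCGGTCGA
Edge 0->2: CCGGTACC
Path 0->1: GCATGGCCAGCTTAGG, with edge strand CCGGTCGA hybridized across the junction
Path 0->2: GCATGGCCATGGCATG, with edge strand CCGGTACC hybridized across the junction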
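The edge strands on this slide follow a regular pattern: each one is the base-by-base complement of the second half of the source vertex followed by the first half of the target vertex, so it can hold the two vertex strands together across their junction. A short sketch of that construction follows, with the complement written in the same left-to-right order as the slide draws it. The helper name `edge_strand` is hypothetical.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base-by-base complement, in the same left-to-right order as the input."""
    return "".join(PAIR[base] for base in strand)

def edge_strand(u_seq, v_seq):
    """Strand that pairs with the last half of u_seq and the first half of v_seq."""
    half_u = len(u_seq) // 2
    half_v = len(v_seq) // 2
    junction = u_seq[-half_u:] + v_seq[:half_v]
    return complement(junction)

# Vertex sequences as read from the slide.
VERTEX = {0: "GCATGGCC", 1: "AGCTTAGG", 2: "ATGGCATG"}

print(edge_strand(VERTEX[0], VERTEX[1]))  # CCGGTCGA, the slide's strand for edge 0->1
print(edge_strand(VERTEX[0], VERTEX[2]))  # CCGGTACC, the slide's strand for edge 0->2
```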
Massively Parallel Search
[Figure: example paths formed in parallel]
- V0 V1 V2 V3 V4 V5 V6 (E0->1, E1->2, E2->3, E3->4, E4->5, E5->6)
- V0 V6 (E0->6)
- V0 V3 V2 V3 V4 V5 V6 (E0->3, E3->2, E2->3, E3->4, E4->5, E5->6)
- V4 V5 V1 V2 (E4->5, E5->1, E1->2)
Start = V0, Stop = V6
[Figure: example paths]
- V0 V1 V2 V3 V4 V5 V6 (E0->1, E1->2, E2->3, E3->4, E4->5, E5->6)
- V0 V6 (E0->6)
- V0 V3 V2 V3 V4 V5 V6 (E0->3, E3->2, E2->3, E3->4, E4->5, E5->6)
- V4 V5 V1 V2 (E4->5, E5->1, E1->2)
Gel Electrophoresis - Size Sorting
[Figure: gel in buffer between two electrodes, with samples loaded at one end; the bands are labelled "Faster" and "Slower".]
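In software, size sorting reduces to comparing strand lengths. The sketch below assumes every vertex is encoded by a word of fixed length (8 bases, as in the example words on the graph-encoding slide), so a path through n vertices is a strand of n × 8 bases. It also relies on the general property that shorter fragments migrate through a gel faster than longer ones. The function names and the third example strand are illustrative.

```python
WORD_LEN = 8  # assumption: bases per vertex, matching the 8-base example words

def migration_order(strands):
    """Shorter strands move through the gel faster, so list them first."""
    return sorted(strands, key=len)

def select_by_length(strands, n_vertices, word_len=WORD_LEN):
    """Keep strands whose length matches a path through exactly n_vertices."""
    target = n_vertices * word_len
    return [s for s in strands if len(s) == target]

strands = [
    "GCATGGCCAGCTTAGG",          # 2 vertices, 16 bases
    "GCATGGCC",                  # 1 vertex, 8 bases
    "GCATGGCCAGCTTAGGATGGCATG",  # 3 vertices, 24 bases (illustrative path)
]
print(migration_order(strands))
print(select_by_length(strands, n_vertices=3))
```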
Right Length
[Figure: example paths]
- V0 V1 V2 V3 V4 V5 V6 (E0->1, E1->2, E2->3, E3->4, E4->5, E5->6)
- V0 V6 (E0->6)
- V0 V3 V2 V3 V4 V5 V6 (E0->3, E3->2, E2->3, E3->4, E4->5, E5->6)
Antibody Affinity
[Figure: affinity separation of the target strand CACCATGTGAC using the complementary probe GTGGTACACTG carrying a biotin label (B)]
- Add oligo with Biotin label
- Anneal: Heat and cool
- Bind: Add Paramagnetic-Streptavidin Particles (PMP)
- Isolate with Magnet
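The separation above can be modelled as a substring test: a strand is captured when it contains the site that the labelled probe pairs with. The sketch below is a simplified model that assumes perfect pairing only and ignores partial matches; the function names and the two-strand pool are illustrative. Applying one separation per vertex word gives the "enters every vertex" filter of the algorithm.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base-by-base complement, in the same left-to-right order as the input."""
    return "".join(PAIR[base] for base in strand)

def affinity_select(strands, probe):
    """Keep strands containing the site that the probe pairs with (aligned as drawn)."""
    site = complement(probe)
    return [s for s in strands if site in s]

def visits_every_vertex(strands, vertex_words):
    """Apply one affinity separation per vertex word, one after another."""
    for word in vertex_words:
        strands = affinity_select(strands, complement(word))
    return strands

# Target and biotin-labelled probe sequences from the slide.
print(affinity_select(["CACCATGTGAC", "TTTTTTTTTTT"], "GTGGTACACTG"))

# Vertex words from the graph-encoding slide; the pool itself is illustrative.
words = ["GCATGGCC", "AGCTTAGG", "ATGGCATG"]
pool = [
    "GCATGGCC" + "AGCTTAGG" + "ATGGCATG",  # visits vertices 0, 1, 2
    "GCATGGCC" + "ATGGCATG" + "GCATGGCC",  # visits 0, 2, 0 and misses vertex 1
]
print(visits_every_vertex(pool, words))
```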
Every Vertex
[Figure: example paths]
- V0 V1 V2 V3 V4 V5 V6 (E0->1, E1->2, E2->3, E3->4, E4->5, E5->6)
- V0 V3 V2 V3 V4 V5 V6 (E0->3, E3->2, E2->3, E3->4, E4->5, E5->6)
DNA Word Design Constraints
- Sequence design should implement the architecture.
  - Planned Hybridizations
  - Problem Size
  - Subsequent Processing Reactions
- Designed sequences should minimize unplanned “cross-hybridizations.”
- Consequences of Bad Designs: Errors and Poor Efficiency
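One simple way to screen a word set for unplanned cross-hybridization is to compare each word against the reverse complement of every word, including itself, and flag pairs that differ in too few positions. The sketch below uses Hamming distance as a crude proxy for hybridization strength. That is a common simplification and an assumption here, not a method stated in the slides, and it ignores shifted alignments and thermodynamics. The function names and the threshold are illustrative.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(word):
    """Sequence of the strand that pairs perfectly with `word`."""
    return "".join(PAIR[base] for base in reversed(word))

def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def cross_hybridization_risks(words, min_distance):
    """Pairs (w1, w2) where w1 is fewer than min_distance mismatches away
    from the perfect partner of w2. A word is also checked against itself."""
    risks = []
    for i, w1 in enumerate(words):
        for w2 in words[i:]:
            if hamming(w1, reverse_complement(w2)) < min_distance:
                risks.append((w1, w2))
    return risks

# The second word is the exact reverse complement of the first, so the pair is flagged.
print(cross_hybridization_risks(["AAAACCCC", "GGGGTTTT"], min_distance=3))
# Vertex words from the graph-encoding slide.
print(cross_hybridization_risks(["GCATGGCC", "AGCTTAGG", "ATGGCATG"], min_distance=3))
```

Checking each unordered pair once is sufficient because reversing and complementing both words leaves the mismatch count unchanged.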
DNA Word Design
- Design problem is hard (NP-Complete).
  - As the number of sequences required to represent the problem increases, this constraint increasingly conflicts with the requirement of non-crosshybridization.
- How much of DNA sequence space is available for computation and assembly?
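For very short words, the question of how much sequence space is usable can be explored by brute force. The sketch below greedily collects words of a given length such that every chosen word differs in at least `min_distance` positions from every other chosen word and from every chosen word's reverse complement, its own included. Both the Hamming-distance criterion and the greedy strategy are assumptions made for illustration: the greedy result is only a lower bound on the largest possible set, and the slides do not specify a design method.

```python
from itertools import product

PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(word):
    return "".join(PAIR[base] for base in reversed(word))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def greedy_word_set(length, min_distance):
    """Scan all 4**length words in lexicographic order and keep each word
    that is far enough from everything chosen so far."""
    chosen = []
    for letters in product("ACGT", repeat=length):
        word = "".join(letters)
        # Reject words too close to their own reverse complement.
        if hamming(word, reverse_complement(word)) < min_distance:
            continue
        compatible = all(
            hamming(word, other) >= min_distance
            and hamming(word, reverse_complement(other)) >= min_distance
            for other in chosen
        )
        if compatible:
            chosen.append(word)
    return chosen

for d in (1, 2, 3):
    kept = greedy_word_set(4, d)
    print(f"length 4, min distance {d}: kept {len(kept)} of {4 ** 4} words")
```

Raising `min_distance` shrinks the usable set, which illustrates the conflict described above between problem size and non-crosshybridization.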
Team
- Russell Deaton, Weixia Yu, Maryam Nuser, Chris Harris, University of Arkansas, Computer Science and Engineering
- Junghuei Chen, Hong Bi, Yu-Zhen Wang, University of Delaware, Chemistry and Biochemistry
- Jin-Woo Kim, Dylan Carpenter, Ju Seok Lee, University of Arkansas, Biological Engineering
- Max Garzon, University of Memphis, Computer Science
- Harvey Rubin, University of Pennsylvania, School of Medicine
- David Wood, University of Delaware, Computer and Information Science