Date posted: 18-Dec-2015

1

DNA Computation: The Secret of Life as Non-Living Technology

Russell Deaton
Professor
Comp. Science & Engineering
The University of Arkansas
Fayetteville, AR 72701

2

"We have discovered the secret of life!" - Francis Crick, Feb. 28, 1953

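3

Physical Structure of DNA

[Figure: DNA double helix. Labels: nitrogenous base, major groove, minor groove, central axis, sugar-phosphate backbone; dimensions 34 Å (one helical turn) and 20 Å (helix diameter); the two strands run antiparallel, 5' to 3' and 3' to 5', with the 5' C and 3' OH ends marked.]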

4

DNA in the Cell

• Stored in a Number of Chromosomes (24 in Human Genome)
• Tightly coiled threads of DNA and Associated Proteins
• 3 billion bp in Human Genome: Total genetic content of cell

5

6

Genetic Code

7

8

9

Genome Sizes

Organism            Genome Size (Bases)    Genes
Human               3 billion              30,000
Laboratory mouse    2.6 billion            30,000
Mustard weed        100 million            25,000
Roundworm           97 million             19,000
Fruit fly           137 million            13,000
Yeast               12.1 million           6,000
Bacterium           4.6 million            3,200
HIV                 9700                   9

10

DNA for Non-Biological Purposes

• Encode Abiotic Information in DNA Sequences
  • Graphs and Structure
  • Computational Problems
  • Database and Memory
• Search and/or Assemble Through DNA-to-DNA Template-matching reactions
• Enzymatic Operations for Further Information Processing
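One way to picture "encoding abiotic information in DNA sequences" is a toy codec that stores two bits per base. The mapping below (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration; the slides do not specify an encoding.

```python
# Hypothetical 2-bits-per-base mapping; not taken from the slides.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bits first."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Inverse of encode: four bases back into one byte."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"DNA"
strand = encode(message)
print(strand, decode(strand) == message)
```

At two bits per base, a strand of N bases holds at most 2N bits. A practical encoding would likely give up some of that capacity to avoid sequences that hybridize where they should not.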

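11

Physical Structure of DNA

[Figure: DNA double helix. Labels: nitrogenous base, major groove, minor groove, central axis, sugar-phosphate backbone; dimensions 34 Å (one helical turn) and 20 Å (helix diameter); the two strands run antiparallel, 5' to 3' and 3' to 5', with the 5' C and 3' OH ends marked.]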

12

Template Matching Hybridization Reaction

[Figure: two complementary single strands, shown first apart and then hybridized into a duplex]

'A-C-A-A-C-G
 T-G-T-T-G-C'
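The template-matching rule in the figure (A pairs with T, C pairs with G) can be sketched in a few lines. This is a minimal model that assumes perfect, position-by-position pairing with no shifts or mismatches; the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction (the antiparallel partner)."""
    return complement(strand)[::-1]


def hybridizes(top: str, bottom_aligned: str) -> bool:
    """True if the bottom strand, written aligned under the top, pairs everywhere."""
    return len(top) == len(bottom_aligned) and complement(top) == bottom_aligned


print(hybridizes("ACAACG", "TGTTGC"))
print(reverse_complement("ACAACG"))
```

Real hybridization is not all-or-nothing: partially matching strands can also bind, which is the source of the mismatch errors discussed later in the deck.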

13

Hybridization Allows:

• Massively Parallel Search based on Watson-Crick Complements
• Directed Self-Assembly of Nanostructures
• Search Stored Information for Similar Sequence Content

14

Differences to Biology

• No Proof-reading enzymes
• Cell versus the Test Tube
• Technology Driven by Template-matching reactions between relatively short oligonucleotides
• Biology: DNA primarily duplex
• Abiotic: DNA primarily single-stranded

15

How to encode a graph?

16

Algorithm

1. Generate random paths through the graph.
2. Keep only those paths that begin with v_in and end with v_out.
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices.
4. Keep only those paths that enter all the vertices at least once.
5. If any paths remain, say “Yes”; otherwise, say “No”.
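The generate-and-filter algorithm above can be imitated in software as a rough sketch. Random walks stand in for random strand assembly; the edge list, the early-stop probability, the sample count, and all function names are assumptions chosen for illustration.

```python
import random

# Hypothetical directed graph on vertices 0..6, given as (from, to) pairs.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6),
         (0, 6), (0, 3), (3, 2), (5, 1)]


def random_path(edges, max_vertices, rng):
    """Random walk over the edges, standing in for random strand assembly."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    path = list(rng.choice(edges))
    while len(path) < max_vertices and path[-1] in successors:
        if rng.random() < 0.1:  # assumed chance that assembly stops early
            break
        path.append(rng.choice(successors[path[-1]]))
    return tuple(path)


def hamiltonian_path(edges, n, v_in, v_out, samples=20000, seed=0):
    rng = random.Random(seed)
    # Generate random paths through the graph.
    paths = {random_path(edges, 2 * n, rng) for _ in range(samples)}
    # Keep only paths that begin with v_in and end with v_out.
    paths = {p for p in paths if p[0] == v_in and p[-1] == v_out}
    # Keep only paths that enter exactly n vertices.
    paths = {p for p in paths if len(p) == n}
    # Keep only paths that enter every vertex at least once.
    paths = {p for p in paths if set(p) == set(range(n))}
    return ("Yes" if paths else "No"), paths


answer, found = hamiltonian_path(EDGES, n=7, v_in=0, v_out=6)
print(answer, found)
```

Because the first step is random, a “No” from this sketch only means that no qualifying path was sampled. The test-tube version relies on a very large number of strands to make a missed path unlikely.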

17

Representing a Graph with Sequences

[Figure: graph with vertices 0, 1 and 2]

Vertex 0: 'GCATGGCC
Vertex 1: 'AGCTTAGG
Vertex 2: 'ATGGCATG

Edge strands: CCGGTCGA' and CCGGTACC'

Hybridized:

'GCATGGCCAGCTTAGG
     CCGGTCGA'

'GCATGGCCATGGCATG
     CCGGTACC'
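The sequences on this slide are consistent with a simple rule: an edge strand is the complement of the second half of its source vertex followed by the first half of its target vertex, so it splints the two vertex strands together. The sketch below reproduces the slide's two edge strands under that reading; the function name and the equal-length assumption are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Vertex sequences as they appear on the slide.
VERTEX = {0: "GCATGGCC", 1: "AGCTTAGG", 2: "ATGGCATG"}


def edge_strand(u, v, vertex=VERTEX):
    """Complement of (second half of u) + (first half of v).

    The result is written aligned under the joined vertex strands, so it
    reads in the opposite chemical direction. Assumes equal-length vertices.
    """
    half = len(vertex[u]) // 2
    template = vertex[u][half:] + vertex[v][:half]
    return "".join(COMPLEMENT[base] for base in template)


for u, v in [(0, 1), (0, 2)]:
    joined = VERTEX[u] + VERTEX[v]
    print(joined)
    print(" " * (len(VERTEX[u]) // 2) + edge_strand(u, v))
```

Each edge strand overlaps half of each neighbour, so a chain of vertex strands can only be held together by edges that exist in the graph.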

18

Massively Parallel Search

Path: V0 V1 V2 V3 V4 V5 V6
Edges: E0->1 E1->2 E2->3 E3->4 E4->5 E5->6

Path: V0 V6
Edges: E0->6

Path: V0 V3 V2 V3 V4 V5 V6
Edges: E0->3 E3->2 E2->3 E3->4 E4->5 E5->6

Path: V4 V5 V1 V2
Edges: E4->5 E5->1 E1->2

19

Algorithm

1. Generate random paths through the graph.
2. Keep only those paths that begin with v_in and end with v_out.
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices.
4. Keep only those paths that enter all the vertices at least once.
5. If any paths remain, say “Yes”; otherwise, say “No”.

20

DNA Polymerase

21

Polymerase Chain Reaction

22

Start = V0, Stop = V6

Path: V0 V1 V2 V3 V4 V5 V6
Edges: E0->1 E1->2 E2->3 E3->4 E4->5 E5->6

Path: V0 V6
Edges: E0->6

Path: V0 V3 V2 V3 V4 V5 V6
Edges: E0->3 E3->2 E2->3 E3->4 E4->5 E5->6

Path: V4 V5 V1 V2
Edges: E4->5 E5->1 E1->2
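PCR with primers for V0 and V6 selects by amplification: strands carrying both primer sites roughly double each cycle, while the rest fall behind. The following is an idealized sketch of that effect (perfect efficiency, no side products); the function name and the cycle count are assumptions.

```python
def pcr_copies(path, v_in, v_out, cycles):
    """Idealized copy number grown from one starting strand."""
    starts = path[0] == v_in
    stops = path[-1] == v_out
    if starts and stops:
        return 2 ** cycles   # both primers bind: exponential growth
    if starts or stops:
        return 1 + cycles    # only one primer binds: linear growth
    return 1                 # neither primer binds: not amplified


# The four example paths from the slide, as vertex indices.
PATHS = [(0, 1, 2, 3, 4, 5, 6), (0, 6), (0, 3, 2, 3, 4, 5, 6), (4, 5, 1, 2)]

for path in PATHS:
    print(path, pcr_copies(path, v_in=0, v_out=6, cycles=20))
```

After enough cycles the strands that start at V0 and stop at V6 dominate the mixture. That matches the "keep only" step in practice, even though nothing is physically removed.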

23

Algorithm

1. Generate random paths through the graph.
2. Keep only those paths that begin with v_in and end with v_out.
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices.
4. Keep only those paths that enter all the vertices at least once.
5. If any paths remain, say “Yes”; otherwise, say “No”.

24

Gel Electrophoresis - Size Sorting

[Figure: gel electrophoresis apparatus. Labels: buffer, gel, two electrodes, samples; shorter strands migrate faster, longer strands slower.]

25

Right Length

Path: V0 V1 V2 V3 V4 V5 V6
Edges: E0->1 E1->2 E2->3 E3->4 E4->5 E5->6

Path: V0 V6
Edges: E0->6

Path: V0 V3 V2 V3 V4 V5 V6
Edges: E0->3 E3->2 E2->3 E3->4 E4->5 E5->6
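Size sorting works because a path through k vertices yields a strand of k times the vertex length, so "exactly n vertices" becomes "exactly one band on the gel". The sketch below assumes 20 bases per vertex, a hypothetical value; the function names are also illustrative.

```python
VERTEX_LENGTH = 20  # hypothetical number of bases per vertex sequence


def strand_length(path, vertex_length=VERTEX_LENGTH):
    """Length in bases of the strand that encodes a path."""
    return len(path) * vertex_length


def size_select(paths, n_vertices, vertex_length=VERTEX_LENGTH):
    """Keep strands whose length matches a path through exactly n vertices."""
    target = n_vertices * vertex_length
    return [p for p in paths if strand_length(p, vertex_length) == target]


# The three example paths from the slide, as vertex indices.
PATHS = [(0, 1, 2, 3, 4, 5, 6), (0, 6), (0, 3, 2, 3, 4, 5, 6)]

for path in PATHS:
    print(path, strand_length(path))
print(size_select(PATHS, n_vertices=7))
```

The short path V0 V6 is rejected, but the path that visits V3 twice has the right length and survives, so length alone is not sufficient.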

26

Algorithm

1. Generate random paths through the graph.
2. Keep only those paths that begin with v_in and end with v_out.
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices.
4. Keep only those paths that enter all the vertices at least once.
5. If any paths remain, say “Yes”; otherwise, say “No”.

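27

Antibody Affinity

[Figure: affinity separation. The strand CACCATGTGAC anneals to the biotin-labeled (B) complementary oligo GTGGTACACTG; the duplex binds to paramagnetic-streptavidin particles (PMP) and is isolated with a magnet (N, S).]

• Add oligo with Biotin label
• Heat and cool (anneal)
• Add Paramagnetic-Streptavidin Particles (bind)
• Isolate with Magnet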

28

Every Vertex

Path: V0 V1 V2 V3 V4 V5 V6
Edges: E0->1 E1->2 E2->3 E3->4 E4->5 E5->6

Path: V0 V3 V2 V3 V4 V5 V6
Edges: E0->3 E3->2 E2->3 E3->4 E4->5 E5->6
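The every-vertex check can be pictured as one affinity separation per vertex, each keeping only the strands that contain that vertex. This is a minimal sketch at the level of vertex indices rather than sequences; the function names are assumptions.

```python
def affinity_select(paths, vertex):
    """One magnetic-bead separation: keep strands that contain `vertex`."""
    return [p for p in paths if vertex in p]


def select_every_vertex(paths, n_vertices):
    """Apply the separation once per vertex, feeding each result to the next."""
    for vertex in range(n_vertices):
        paths = affinity_select(paths, vertex)
    return paths


# The two example paths from the slide, as vertex indices.
PATHS = [(0, 1, 2, 3, 4, 5, 6), (0, 3, 2, 3, 4, 5, 6)]

print(select_every_vertex(PATHS, n_vertices=7))
```

The second path never enters V1, so it is lost at that separation and only the path V0 through V6 remains.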

29

Algorithm

1. Generate random paths through the graph.
2. Keep only those paths that begin with v_in and end with v_out.
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices.
4. Keep only those paths that enter all the vertices at least once.
5. If any paths remain, say “Yes”; otherwise, say “No”.

30

Hamiltonian Path

Path: V0 V1 V2 V3 V4 V5 V6
Edges: E0->1 E1->2 E2->3 E3->4 E4->5 E5->6

31

Mismatches

32

Errors

33

DNA Word Design Constraints

• Sequence design should implement the architecture.
  • Planned Hybridizations
  • Problem Size
  • Subsequent Processing Reactions
• Designed sequences should minimize unplanned “cross-hybridizations.”
• Consequences of Bad Designs: Errors and Poor Efficiency
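A crude way to screen a word set for unplanned cross-hybridization is to count the mismatches that would occur if two words paired in antiparallel alignment. The sketch below uses Hamming distance as a stand-in for binding strength. That proxy, the threshold, and the example words are assumptions; a real design tool would also consider shifted alignments and thermodynamics.

```python
from itertools import combinations_with_replacement

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word):
    return "".join(COMPLEMENT[base] for base in reversed(word))


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatches_when_paired(a, b):
    """Mismatches if word a pairs with word b in antiparallel alignment."""
    return hamming(a, reverse_complement(b))


def risky_pairs(words, min_mismatches):
    """Word pairs (self-pairs included) that pair with too few mismatches."""
    return [(a, b) for a, b in combinations_with_replacement(words, 2)
            if mismatches_when_paired(a, b) < min_mismatches]


WORDS = ["AACCGG", "CCGGTT", "ACACAC"]  # hypothetical word set
print(risky_pairs(WORDS, min_mismatches=3))
```

Here the first two words are exact reverse complements of each other, so they would bind as readily as any planned hybridization. That is the kind of design error the constraints are meant to exclude.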

34

DNA Word Design

• Design problem is hard (NP-Complete).
• As the number of sequences required to represent the problem increases, this constraint increasingly conflicts with the requirement of non-cross-hybridization.
• How much of DNA sequence space is available for computation and assembly?
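To get a feel for how much of sequence space survives a non-cross-hybridization requirement, one can greedily collect words of length n that stay at least d mismatches away from each other and from each other's reverse complements. This is a heuristic sketch and not the method used by the authors. It gives one achievable library size, not the maximum, and the distance criterion is an assumed stand-in for real hybridization behavior.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word):
    return "".join(COMPLEMENT[base] for base in reversed(word))


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def greedy_library(n, d):
    """Scan all 4**n words in order and keep each one that fits the library."""
    library = []
    for letters in product("ACGT", repeat=n):
        word = "".join(letters)
        if hamming(word, reverse_complement(word)) < d:
            continue  # too close to its own reverse complement
        if all(hamming(word, other) >= d
               and hamming(word, reverse_complement(other)) >= d
               for other in library):
            library.append(word)
    return library


n, d = 6, 3
library = greedy_library(n, d)
print(f"{len(library)} of {4 ** n} words of length {n} kept at distance {d}")
```

Raising d or the required number of words quickly exhausts the space, which illustrates the conflict between problem size and non-cross-hybridization.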

35

36

37

Library Sizes

38

39

Nanotechnology Code

40

First Step

41

42

Team

• Russell Deaton, Weixia Yu, Maryam Nuser, Chris Harris, University of Arkansas, Computer Science and Engineering
• Junghuei Chen, Hong Bi, Yu-Zhen Wang, University of Delaware, Chemistry and Biochemistry
• Jin-Woo Kim, Dylan Carpenter, Ju Seok Lee, University of Arkansas, Biological Engineering
• Max Garzon, University of Memphis, Computer Science
• Harvey Rubin, University of Pennsylvania, School of Medicine
• David Wood, University of Delaware, Computer and Information Science

43

Acknowledgement

This work was supported by the NSF QuBIC program, award number EIA-0130385.