Coding For DNA Computing: Combinatorial and Biophysical Aspects

Coding for DNA Computing: Combinatorial and Biophysical Aspects Olgica Milenkovic University of Colorado, Boulder A Joint Work with Navin Kashyap Queen’s University, Kingston


Page 1: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Coding for DNA Computing:

Combinatorial and Biophysical Aspects

Olgica Milenkovic, University of Colorado, Boulder

A Joint Work with Navin Kashyap, Queen’s University, Kingston

Page 2: Coding For DNA Computing: Combinatorial and Biophysical Aspects

LDPC ITERATIVE DECODING

Page 3: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Outline

• The DNA Computing Paradigm
• Applications
• Error-Control Coding for DNA Computing
• Constrained Coding: DNA Secondary and Tertiary Structure
• Statistical Mechanics of DNA/RNA Folding
• Results and Open Problems

Page 4: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Molecular Biology: Terminology

• DNA Double Helix
• Watson-Crick Complements: A → T, G → C, T → A, C → G
• RNA: Single-Stranded, T Replaced by U
• Helix Denaturation (Ambient Temperature Governed)
• DNA Oligonucleotide Sequences
• DNA Hybridization
• DNA Enzymes: Functional Proteins Operating on DNA
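The complement rules listed on this slide amount to a lookup table. A minimal Python sketch, in which the function names `complement`, `reverse_complement` and `to_rna` are illustrative choices and not taken from the slides:

```python
# Watson-Crick complement table from the slide: A<->T, G<->C.
WC = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Complement each base, keeping the reading direction."""
    return "".join(WC[base] for base in strand)

def reverse_complement(strand):
    """Complement each base and reverse the strand."""
    return complement(strand)[::-1]

def to_rna(strand):
    """RNA carries U wherever DNA carries T."""
    return strand.replace("T", "U")

print(complement("ATGCCG"))          # TACGGC
print(reverse_complement("ATGCCG"))  # CGGCAT
print(to_rna("ATGCCG"))              # AUGCCG
```

The reverse complement is the form that matters for hybridization, since the two strands of a double helix run in opposite directions.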

Page 5: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA Computing: Adleman’s Experiment (1994)

The Problem: An “Unremarkable” Instance of the Directed Traveling Salesman Problem (a Directed Hamiltonian Path Problem) on a Graph with Seven Nodes (Figures from Adleman, SA 1998)

The Method: Remarkable Oligonucleotide DNA Hybridization Technique
• City Codewords: Miami (CTACGG), NY (ATGCCG)
• Route (Edge): Second Half of Codeword for Miami (CGG) and First Half of Codeword for NY (ATG): CGGATG
• Take the Complement of this Word: GCCTAC
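The edge encoding described on this slide can be reproduced in a few lines. This is only a sketch of the string manipulation, not of the laboratory protocol; the dictionary `CITY` and the helper `edge_word` are hypothetical names.

```python
WC = {"A": "T", "T": "A", "G": "C", "C": "G"}

# City codewords quoted on the slide.
CITY = {"Miami": "CTACGG", "NY": "ATGCCG"}

def edge_word(src, dst):
    """Second half of the source codeword followed by the first half of the
    destination codeword."""
    s, d = CITY[src], CITY[dst]
    return s[len(s) // 2:] + d[:len(d) // 2]

def complement(strand):
    return "".join(WC[base] for base in strand)

route = edge_word("Miami", "NY")
print(route)              # CGGATG
print(complement(route))  # GCCTAC
```

The complemented route strand can bind across the junction of the two city strands, which is how a path through the graph is assembled by hybridization.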

Page 6: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA Computing: The Benefits

• Not a von Neumann Architecture: Stochastic Mechanism with Massive Parallelism: 1/50th of a Teaspoon, 10^14 Paths per Second
• Extremely Low Power Consumption: 1 Joule for 2 · 10^19 Operations
• Storage Capacity: Vol(1 g of DNA) = 1 cm^3, Information = 1 Trillion CDs; 18 Mb/inch of Length (0.35 nm Between Base Pairs)
• Versatility of Applications, Only Plausible Option in Many Cases

Drawbacks:
• First Implementations not Interactive
• 3-Day Processing Delay
• VERY LOW RELIABILITY OF COMPUTATION

Page 7: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Applications of DNA Computers

Combinatorial Problems:
• Directed Traveling Salesman (Adleman ‘94)
• 3-SAT (Braich et al., ‘02)
  Input: a 20-Variable, 24-Clause Boolean Function in 3-Conjunctive Normal Form (3-CNF). For each Variable, two DNA Sequences of Length 15 are Assigned, one Representing the Variable, the other Representing its Complement (Operon Technology, Alameda, CA; Integrated DNA Technologies, Skokie, IL)
• Non-Attacking Knights (Faulhammer, ’00)
  Configurations of Knights that can be Placed on an n×n Chess Board so that no Knight is Attacking any other Knight on the Board
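For scale, the non-attacking knights problem can be enumerated on a conventional computer by brute force. The sketch below is a plain exhaustive search, not the DNA procedure. It assumes that the empty board counts as a configuration; `count_non_attacking` is an illustrative name.

```python
KNIGHT_MOVES = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                (1, -2), (2, -1), (-1, -2), (-2, -1)]

def count_non_attacking(n):
    """Count placements of any number of knights (including none) on an
    n x n board such that no knight attacks another."""
    squares = [(r, c) for r in range(n) for c in range(n)]
    index = {sq: i for i, sq in enumerate(squares)}
    # attacked[i] is a bitmask of the squares reachable from square i.
    attacked = []
    for r, c in squares:
        mask = 0
        for dr, dc in KNIGHT_MOVES:
            target = (r + dr, c + dc)
            if target in index:
                mask |= 1 << index[target]
        attacked.append(mask)
    count = 0
    for placement in range(1 << len(squares)):
        ok = all(not (placement & attacked[i])
                 for i in range(len(squares))
                 if (placement >> i) & 1)
        count += ok
    return count

print(count_non_attacking(3))  # 94
```

The search space doubles with every added square, which is the kind of growth that motivates massively parallel approaches.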

Figure

Page 8: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Novel Designs of DNA Computers

DNA Logic and Automata: Interactive Systems
• DNA Transistors (Stojanovic, Stefanovic ‘03)
• DNA Game-Playing Machines (Stojanovic, Stefanovic ‘03)

MAYA:
• Consists of Nine Wells (Tubes) Representing the 3×3 Tic-Tac-Toe Board
• Tubes Contain Mixtures of Enzymes: Network of 23 Molecular Logic Gates
• “Human Player” has Nine Different DNA Strands, each Specific to one Square on the Board
• Player Selects one Square to Play: DNA Strand Representing that Square gets Added to all the Nine Wells
• MAYA “Analyzes” Play Through Biochemical Reactions Occurring in Wells

Page 9: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Meet MAYA…(Stojanovic, Stefanovic 2003)

Applications of DNA Computers

Figure: http://www.cs.unm.edu/~bandrews/ttt-applet/

Page 10: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Applications of DNA Computers

The “Killer Application”: SMART DRUGS

E. Shapiro et al. (Weizmann Institute, Israel), Nature, Science 2003; Quintana et al. 2002

In Vitro DNA-Based Computer “Programmed” to Diagnose Cancer and “Order” Self-Destruction of Cells
• Identifies RNA Cancer Fingerprint Molecules: Cancer Leaves its own “Chemical Fingerprint” in the Body, Including Over-Producing or Under-Producing Specific RNA Sequences (Analysis Based on Regulatory Networks of Gene Interactions, Shmulevich et al., 2002) (Milenkovic and Vasic, DIMACS’2004, ITW’2004)
• Software: DNA, Hardware: DNA Enzymes
• Responds Appropriately by Releasing Short, Active DNA Strand
• Interferes with Tumors by Suppressing Key Cancer Genes, Making Diseased Cells Self-Destruct
• Experiments: Prostate and Lung Cancer Cells

Page 11: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Applications of DNA Computers

Sensing, Storing, Nano-Scale Mechanics…
• Biosensing: DNA Fingerprinting of Bacteria/Viruses, Roco et al. 2004
• DNA-Based Storage Systems: Mansuripur et al., DIMACS’2004
• Nucleic Acid Nanostructures and Topology, DNA Self-Assembly, DNA Nanoscale Mechanical Devices, Seeman et al. 1998-2002

RELIABILITY ISSUES FOR ALL DESCRIBED SYSTEMS UNRESOLVED

Error Control Coding

Constrained Coding

Graph Theory/Combinatorics/Pseudo-Knot Theory

Statistical Mechanics

Page 12: Coding For DNA Computing: Combinatorial and Biophysical Aspects

The Biggest Obstacles…
• DNA Oligonucleotide Secondary and Tertiary Structure Formation: DNA Oligonucleotide Sequences are Chemically Active, Tend to Assume Thermodynamically Most Stable Form!
• Unwanted Hybridization: DNA Sequences can Bind to Partially Complementary Sequences as Well!

Page 13: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA/RNA Secondary and Tertiary Structure

Figure Panels: Secondary Structure; Pseudoknots (Tertiary Structure)

Mneimneh, 2003 (Figures from Web Lecture Notes)

Page 14: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA Hairpins

DNA/RNA Hairpin Structures Participate in Important Biological Functions:

• Regulation of Gene Expression (Zazopoulos et al., 1997);
• DNA Recombination (Froelich-Ammon et al., 1994);
• Facilitation of Mutagenic Events (Trinh and Sinden, 1993): in a Living Cell, after Breaking of Intermolecular Pairing in Double Helix DNA, Loose Strands Form a DNA Hairpin;
• Potential Antisense Drug (Tang et al., 1993): Injecting into a Living Cell a Hairpin with Nucleic Acid Bases Complementary to an mRNA of a Disease Gene Blocks its Expression

Page 15: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA/RNA Knots

RNA Secondary Structure Influences Function of RNA: Knots are Special “Regulators”

Figures: Haslinger, 2001; Craven, 2001

Page 16: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Mathematical Formulation

Definition 1 (Haslinger, 2001): A Secondary Structure S is a Vertex-Labeled Graph on n Vertices, for which the Adjacency Matrix $A = (a_{i,j})$ has the Following Properties:

1. $a_{i,i+1} = 1$ for $1 \le i \le n-1$;
2. For each $i$ there is at most one $k \ne i \pm 1$ such that $a_{i,k} = 1$;
3. If $a_{i,j} = a_{k,l} = 1$ and $i < k < j$, then $i < l < j$.

An Edge (i,j), |i-j|>1, is Called a Base-Pairing. A Secondary Structure Can Consist of the Following Structural Elements:

1. A Stack Consists of Subsequent Base Pairs (p-k,q+k), (p-k+1,q+k-1),…,(p,q); k is the Length of the Stack
2. A Loop Consists of all Unpaired Vertices which are Immediately Interior to some Terminal Base Pair
3. An External Vertex is an Unpaired Vertex which does not Belong to a Loop

Page 17: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Mathematical Formulation

If Definition 1, Part 3 is Violated for a Base Pairing, then the Resulting Formation is Referred to as a Pseudoknot

With Information about Energy of Pairings and Additional Measurements Regarding the DNA Backbone, Determining Stable Secondary Structures Becomes a Purely Combinatorial Problem

• Secondary Structure Prediction: Dynamic Programming Approach, Polynomial Time (Nussinov’s and Zuker’s Algorithms)
• Pseudoknots: NP-Complete, Except for Special Class of H-Knots (Rivas, Eddy 2003)
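The non-crossing condition that separates a secondary structure from a pseudoknot can be checked directly on a list of base pairs. The sketch below assumes that only the base pairings (edges with |i-j| > 1) are listed, that backbone edges are implicit, and that vertices are numbered 1..n. The function names are illustrative.

```python
def has_pseudoknot(pairs):
    """True if two base pairs cross, i.e. i < k < j < l for pairs (i, j), (k, l)."""
    norm = [tuple(sorted(p)) for p in pairs]
    return any(i < k < j < l for (i, j) in norm for (k, l) in norm)

def is_secondary_structure(n, pairs):
    """Check the pairing conditions: each vertex has at most one partner,
    partners are not backbone neighbours, and no two pairs cross."""
    partner = {}
    for i, j in (tuple(sorted(p)) for p in pairs):
        if not (1 <= i and j <= n) or j - i <= 1:
            return False            # a base pairing needs |i - j| > 1
        if i in partner or j in partner:
            return False            # at most one partner per vertex
        partner[i], partner[j] = j, i
    return not has_pseudoknot(pairs)

print(is_secondary_structure(9, [(1, 9), (2, 8), (3, 7)]))  # True (a hairpin stem)
print(has_pseudoknot([(1, 5), (3, 8)]))                     # True (pairs cross)
```

The absence of crossings is what allows a structure to be split into independent inner and outer parts, which dynamic programming exploits.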

Page 18: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Nussinov’s Folding Algorithm

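Sequence: $c_1, \dots, c_n$, $c_i \in \{A, T, G, C\}$

Pairing Energies: $\alpha(c_i, c_j) < 0$ if $c_i, c_j$ are WC Complements, $\alpha(c_i, c_j) = 0$ Otherwise; in the Example, $\alpha(c_i, c_j) = -1$ for WC Complements and $\alpha(c_i, c_j) = 0$ for non-WC Pairs

Free Energy of Secondary Structure S: $E_{free} = \langle E[S] \rangle_{av} - T\,H[S] = -T \log Z$

$E_{i,j}$: Free Energy of Secondary Structure Limited to Positions i, i+1, …, j

$$E_{i,j} = \min\{\, E_{i+1,j-1} + \alpha(c_i, c_j),\; E_{i,k-1} + E_{k,j},\; i < k \le j \,\}$$

Free Energy Table: Sequence CCCAAATGG

Figure: Mneimneh, 2003, Bundschuh, 2004

Feynman Diagrams for RNA Structure Prediction (Eddy, Rivas 2001)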
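A minimal sketch of the dynamic program behind the free-energy table, assuming the textbook Nussinov recursion E[i][j] = min(E[i+1][j-1] + α(c_i, c_j), min over i < k ≤ j of E[i][k-1] + E[k][j]), with α = -1 for Watson-Crick pairs and 0 otherwise, and with adjacent bases allowed to pair. Indices in the code are 0-based; `nussinov_table` is an illustrative name.

```python
WC_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def alpha(x, y):
    """Toy pairing energy: -1 for a Watson-Crick pair, 0 otherwise."""
    return -1 if (x, y) in WC_PAIRS else 0

def nussinov_table(seq):
    """E[i][j] is the minimum energy of a secondary structure on seq[i..j].
    Entries below the diagonal stay 0, which also serves as the value of
    an empty interval."""
    n = len(seq)
    E = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = E[i + 1][j - 1] + alpha(seq[i], seq[j])   # i pairs with j
            for k in range(i + 1, j + 1):                    # split into two parts
                best = min(best, E[i][k - 1] + E[k][j])
            E[i][j] = best
    return E

table = nussinov_table("CCCAAATGG")
print(table[0][8])  # -3
```

Filling the table takes O(n^3) operations for a sequence of length n, since each of the O(n^2) entries scans up to n split points.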

Page 19: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Statistical Physics: DNA Ensemble Analysis

Bundschuh, Hwa 2004

$$Z(n) = \sum_{S \in \mathcal{S}(n)} \exp(-\beta E[S]), \qquad E[S] = \sum_{(i,j) \in S} \alpha(c_i, c_j), \qquad \beta = 1/(k_B T)$$

Bundschuh, Hwa 2004: Statistics of Secondary Structures in Ensemble of Long Random DNA Sequences

Why?
• Detection of Important Structural Components in mRNAs, Functional RNAs
• Characterization of the Response of Long Oligonucleotide DNA Molecule to Pulling Forces

Random DNA = Problem of Disordered Systems

Page 20: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Statistical Physics: DNA Ensemble Analysis

Molten Phase: Absence of Disorder

$$\theta_0 = 3/2, \qquad z_0(q) = 1 + 2\sqrt{q}, \qquad A_0(q) = \left[\frac{1 + 2\sqrt{q}}{4\pi q^{3/2}}\right]^{1/2}$$

Thermodynamic Ensemble: Large Number of Different Secondary Structures with Equal Energy

Stability of Molten Phase: Use N-Replica Method

Page 21: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Statistical Physics: DNA Ensemble Analysis

Glassy Phase: Few Low Energy Configurations in Thermodynamic Limit

Droplet Theory (Huse and Fisher): “Large-Scale Low-Energy Excitations” About Ground State
• Impose Deformation over a Length Scale L >> 1, Monitor Minimal Free Energy Cost of Deformation;
• Cost Expected to Scale as $L^{\omega}$ for Large L: Positive ω Indicates Deformation Cost Grows with Increasing Size. Negative ω Indicates Deformation Cost Decays: there is a Large Number of Configurations with Low Overlap with Ground State, whose Energies are Similar to the Ground State Energy in the Thermodynamic Limit (Zero-Temperature Behavior not Stable to Thermal Fluctuations: No Thermodynamic Glass Phase can Exist at any Finite Temperature)

Related Analysis: A. Pagnani, G. Parisi, and F. Ricci-Tersenghi, 2000/2001

Page 22: Coding For DNA Computing: Combinatorial and Biophysical Aspects

The Stability of a Particular Secondary Structure is a Function of Several Constraints:
1) Number of GC versus AT/GT Base Pairs (Larger Number of Hydrogen Bonds Forms more Stable Structures)
2) Number of Base Pairs Forming a Stem Region (Presence of Long Subsequence and its Reverse Complement Leads to Stabilization)
3) Number of Base Pairs in a Hairpin (More than 15 or Less than 4-7 Bases put “Stress” on the Loop)
4) Number of Unpaired Bases (More Unpaired Bases Lead to Less Stable Structure)

Page 23: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Hybridization Constraints

Individual Sequence Constraints (Wood, Tsaftaris etc.):
IP1) The Consecutive-Bases Constraint. Long Runs of the Same Base Forbidden.
IP2) The Constant GC-Content Constraint. Introduced to Achieve Parallelized Operations on DNA Sequences; Assures Similar Thermodynamic (Melting Temperature) Characteristics of all Codewords. GC-Content Usually in the Range of 30-50% of Code Length.

Joint Sequence Constraints:
JP1) The Hamming Distance Constraint. Limits Unwanted Hybridizations between Codewords. Requirement is that all Distinct Pairs of Codewords p, q in C be at Hamming Distance at Least $d_{min}$. To Limit Undesired Hybridization between a Codeword and the Reverse-Complement of any other Codeword (Including Itself), the Reverse-Complement Hamming Distance has to be at Least $d^{RC}_{min}$.
JP2) The Frame-Shift Constraint. Applies Only to Limited Number of Problems. Refers to Requirement that Concatenation of Two or More Codewords should not Properly Contain Another Codeword.
JP3) The Forbidden Subsequence Constraint. Specifies that a Class of Substrings Must not Occur in any Codeword or Concatenation of Codewords.
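The individual constraints and the frame-shift constraint are simple to test on candidate codewords. The sketch below is only an illustration: thresholds such as the longest permitted run are left to the caller, the frame-shift check is restricted to concatenations of two codewords of equal length, and all function names are hypothetical.

```python
def longest_run(word):
    """IP1: length of the longest run of identical bases."""
    best = run = 0
    prev = None
    for base in word:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best

def gc_content(word):
    """IP2: number of G and C bases in the word."""
    return sum(base in "GC" for base in word)

def contains_forbidden(word, forbidden):
    """JP3: True if any forbidden substring occurs in the word."""
    return any(pattern in word for pattern in forbidden)

def violates_frame_shift(code):
    """JP2 (two-word version): True if some concatenation p + q contains a
    codeword at a shifted, non-aligned position."""
    n = len(code[0])
    words = set(code)
    return any((p + q)[s:s + n] in words
               for p in code for q in code for s in range(1, n))

print(longest_run("CCCAAATGG"), gc_content("CCCAAATGG"))  # 3 5
print(violates_frame_shift(["ACAC", "CACA"]))             # True
```

Checks of this kind are typically applied as filters after a code with good distance properties has been constructed.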

Page 24: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Code Construction

• Approach I: Binary Mapping
• Approach II: Extended, Cyclic Goppa Codes over GF(4)
• Approach III: Hadamard Matrices with Cyclic Core

WHY Cyclic? Will Show that Computational Complexity for Nussinov’s Algorithm is Significantly Reduced in this Case

PRIOR WORK:
• Addressed 1/2/3 Requirements; No Families of Codes Given (Length Limited to 20);
• No Attempt Whatsoever to Consider Secondary Structure Constraints;
• References: Condon et al. 2000-2004; King 2003; Ryakov 2003; Gaborit and King 2004; Ghrayeb et al. 2004

Page 25: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Terminology

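$$p = p_1 p_2 \dots p_n, \qquad q = q_1 q_2 \dots q_n, \qquad p_i, q_i \in Q = \{A, T, G, C\}$$

Reverse and Reverse-Complement of q ($\bar{x}$ Denotes the Watson-Crick Complement of x):

$$q^{R} = q_n q_{n-1} \dots q_1, \qquad q^{RC} = \bar{q}_n \bar{q}_{n-1} \dots \bar{q}_1$$

$$\mathrm{GC\_Content}(p) = \#\{i : p_i \in \{G, C\}\}$$

DNA Code C: Set of Codewords over Alphabet Q

Minimum Hamming, Reverse and Reverse-Complement Hamming Distance:

$$d_H = \min_{p, q \in C,\ p \ne q} d_H(p, q), \qquad d^{R}_H = \min_{p, q \in C} d_H(p, q^{R}), \qquad d^{RC}_H = \min_{p, q \in C} d_H(p, q^{RC}), \qquad d_{WC} = \min_{p, q \in C,\ p \ne q} d_{WC}(p, q)$$

$$d_{WC}(p, q) = |\{i : p_i \ne \bar{q}_i\}|$$

Constant GC Content Code:

$$\forall\, p \in C: \ \mathrm{GC\_Content}(p) = w$$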
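The distance measures of this slide can be computed directly. The sketch assumes that the Watson-Crick distance counts the positions at which p_i is not the complement of q_i, an interpretation that reproduces the values quoted later in the deck for CCCAAATGG and GACAAAGGT. It also assumes that the reverse and reverse-complement minima include the case p = q. The function names are illustrative.

```python
WC = {"A": "T", "T": "A", "G": "C", "C": "G"}

def hamming(p, q):
    return sum(a != b for a, b in zip(p, q))

def reverse_complement(q):
    return "".join(WC[b] for b in reversed(q))

def d_wc(p, q):
    """Number of positions where p_i is NOT the complement of q_i."""
    return sum(a != WC[b] for a, b in zip(p, q))

def min_hamming(code):
    return min(hamming(p, q) for p in code for q in code if p != q)

def min_reverse_hamming(code):
    return min(hamming(p, q[::-1]) for p in code for q in code)

def min_rc_hamming(code):
    return min(hamming(p, reverse_complement(q)) for p in code for q in code)

print(d_wc("CCCAAATGG", "GCCCAAATG"))  # 7
print(d_wc("GACAAAGGT", "TGACAAAGG"))  # 9
```

A small Watson-Crick distance between two strands means many complementary positions, so the pair is prone to unwanted hybridization.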

Page 26: Coding For DNA Computing: Combinatorial and Biophysical Aspects

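Binary Mapping Approach

$$A \to 00, \qquad T \to 01, \qquad C \to 10, \qquad G \to 11$$

$b(q)$: Binary Image of $q$; $e(q)$: Even Subsequence of $b(q)$; $o(q)$: Odd Subsequence of $b(q)$

$$E(C) = \{e(q) : q \in C\}, \qquad O(C) = \{o(q) : q \in C\}$$

Example: q = ACGTCC

b(q) = 001011011010

e(q) = 011011

o(q) = 001100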

Code D: [n,k,d],

Contains All-Ones Word

Construction:

DNA Code: Number of Codewords Length 2n Hamming, Reverse Complement Hamming Distance

at Least d
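The binary mapping of this slide can be checked against its own example. In the sketch below the bits of b(q) are indexed from 0, so the even subsequence collects the first bit of every base and the odd subsequence the second, a convention inferred from the example q = ACGTCC. The function names are illustrative.

```python
BITS = {"A": "00", "T": "01", "C": "10", "G": "11"}

def binary_image(q):
    """b(q): concatenation of the two-bit images of the bases."""
    return "".join(BITS[base] for base in q)

def even_subsequence(q):
    """e(q): bits b_0, b_2, b_4, ... (first bit of each base)."""
    return binary_image(q)[0::2]

def odd_subsequence(q):
    """o(q): bits b_1, b_3, b_5, ... (second bit of each base)."""
    return binary_image(q)[1::2]

q = "ACGTCC"
print(binary_image(q))      # 001011011010
print(even_subsequence(q))  # 011011
print(odd_subsequence(q))   # 001100
```

Two consequences of the mapping are worth noting: the weight of e(q) equals the GC content of q, and taking the Watson-Crick complement of q leaves e(q) unchanged while flipping every bit of o(q).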

Page 27: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Longest Length Codes…

[Equations: Bounds on the Code Size A as a Function of the Length n and the Distance d]

(Based on Bounds by Ashikhmin et al., 2005)

Binary Mapping: Subcodes of Simplex Codes (All-Zero Not Allowed) -- EVEN

Special Subset of Codewords from Melas/Zetterberg Codes -- ODD

[Equations: Recursively Defined Generator Matrices $G_{i+1}$ Built from $G_i$]

Page 28: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Extended Cyclic Goppa Codes

Approach:
• Take a Family of Reversible ($c \in C \Rightarrow c^{R} \in C$) Cyclic Codes
• Eliminate all Self-Reversible Codewords
• From Each Remaining Pair $(c, c^{R})$ Retain Exactly One Codeword
• Complement Second Half of Each Codeword

Let $L = \{\alpha_1, \alpha_2, \dots, \alpha_n\} \subseteq GF(q^m)$, for $q$ a Power of a Prime, and Let $g(z)$ be a Polynomial of Degree $\delta$ over $GF(q^m)$ such that $g(z)$ has no Root in $L$. The Goppa Code $\Gamma(L)$ Consists of all Words $c = (c_1, c_2, \dots, c_n)$, $c_i \in GF(q)$, such that

$$\sum_{i=1}^{n} \frac{c_i}{z - \alpha_i} \equiv 0 \pmod{g(z)}$$

$\Gamma(L)$ is a Code of Length $n$, Dimension $k \ge n - m\delta$ and Minimum Distance $d \ge \delta + 1$.

$$g(z) = \big((z - \beta_1)(z - \beta_2)\big)^{a}$$

Zhang et al., 1988
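The four-step approach on this slide (reversible code, drop self-reversible words, keep one word per reversal pair, complement the second half) can be tried on a toy example. The code used below is not a Goppa code: it is a hypothetical stand-in consisting of all length-4 words whose base indices sum to 0 mod 4, chosen only because it is closed under reversal and has minimum Hamming distance 2.

```python
from itertools import product

ALPHABET = "ACGT"
WC = {"A": "T", "T": "A", "G": "C", "C": "G"}

def toy_reversible_code(n=4):
    """Hypothetical stand-in for a reversible cyclic code (not a Goppa code)."""
    return ["".join(w) for w in product(ALPHABET, repeat=n)
            if sum(ALPHABET.index(b) for b in w) % 4 == 0]

def complement(word):
    return "".join(WC[b] for b in word)

def dna_code_from_reversible(code):
    # c < c^R drops self-reversible words and keeps one word of each pair {c, c^R}.
    kept = [c for c in code if c < c[::-1]]
    half = len(code[0]) // 2
    # Complement the second half of every retained codeword.
    return [c[:half] + complement(c[half:]) for c in kept]

def hamming(p, q):
    return sum(a != b for a, b in zip(p, q))

code = dna_code_from_reversible(toy_reversible_code())
d_h = min(hamming(p, q) for p in code for q in code if p != q)
d_rc = min(hamming(p, complement(q)[::-1]) for p in code for q in code)
print(len(code), d_h, d_rc)
```

For even length, the reverse complement of a modified word equals the modified version of the reversed original word, so both the Hamming and the reverse-complement Hamming distance of the result inherit the minimum distance of the starting code.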

Page 29: Coding For DNA Computing: Combinatorial and Biophysical Aspects

DNA Codes and Goppa Codes

A Reversible Cyclic Code of Dimension $k$ over $GF(q)$ Contains $q^{\lceil k/2 \rceil}$ Self-Reversible Codewords.

For Arbitrary Positive Integers $a, m$, there Exist DNA Codes $D$ Having the Following Properties:

[Equations: Code Length $n$ and Number of Codewords $M$ as Functions of $a$ and $m$; Lower Bounds on $d_H(D)$ and $d^{RC}_H(D)$ in Terms of $a$]

Choose Constant GC Content Subset of Codewords

Example: CGTTC, CAAAT, CTCCA, GCCTT, GGAGA, ACTAA

Page 30: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Complex (Generalized) Hadamard Matrices

$H = H(n, C_m)$: Matrix of Dimension n×n over the Set of m-th Roots of Unity, $C_m = \{e^{2\pi i l/m},\ l = 0, \dots, m-1\}$, with Property

$$H H^{*} = n I$$

Exponent Matrix: $E(n, Z_p)$ over $Z_p = \{0, 1, 2, \dots, p-1\}$

Choose p=3, and Use only One of G/C

For any $k \in Z^{+}$, there Exist DNA Codes D with $M = 3^k - 1$ Codewords of Length $n = 3^k - 1$, with Constant GC-Content Equal to $3^{k-1}$ and

$$d_H(D) \ge 2 \cdot 3^{k-1}, \qquad d^{RC}_H(D) \ge 3^{k-1}$$

Each Codeword of such a Code is a Cyclic Shift of a Fixed Generator Codeword g.

Theorem [Heng et al., 02]: Let $N = p^k - 1$ for p Prime and a Positive Integer k. Let $g(x) = c_0 + c_1 x + c_2 x^2 + \dots + c_{N-k} x^{N-k}$ be a Monic Polynomial over $Z_p$, of Degree N-k, such that $g(x) h(x) = x^N - 1$ over $Z_p$, for some Monic Irreducible Polynomial h(x) in $Z_p[x]$. Suppose that the Vector $(0, c_0, c_1, c_2, \dots, c_{N-1})$, with $c_i = 0$ for $N-k < i < N$, has the Property that it Contains each Element of $Z_p$ the Same Number of Times. Then the N Cyclic Shifts of the Vector $(c_0, c_1, c_2, \dots, c_{N-1})$ Form the Core of the Exponent Matrix of some Hadamard Matrix $H(p^k, C_p)$
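The construction in the theorem can be tried for p = 3, k = 2. The sketch below makes several assumptions that are not in the slides: it picks h(x) = x^2 + x + 2 as the monic irreducible polynomial, maps the exponents 0, 1, 2 to the bases A, T, G (so that only one of G/C is used), and uses the helper names `divide_mod_p` and `hamming`.

```python
def divide_mod_p(num, den, p):
    """Long division of polynomials over Z_p. Coefficients run from the
    constant term upward; den must be monic. Returns (quotient, remainder)."""
    num = [c % p for c in num]
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        coef = num[shift + len(den) - 1]
        quot[shift] = coef
        for t, d in enumerate(den):
            num[shift + t] = (num[shift + t] - coef * d) % p
    return quot, num[:len(den) - 1]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

p, k = 3, 2
N = p ** k - 1
h = [2, 1, 1]                                  # assumed choice: h(x) = x^2 + x + 2
g, rem = divide_mod_p([-1] + [0] * (N - 1) + [1], h, p)   # g(x) = (x^N - 1) / h(x)
assert not any(rem)                            # h(x) divides x^N - 1
c = g + [0] * (N - len(g))                     # pad with zeros up to length N
counts = [([0] + c).count(s) for s in range(p)]           # balance condition

TO_BASE = {0: "A", 1: "T", 2: "G"}             # assumed mapping, only G is used
codewords = ["".join(TO_BASE[s] for s in c[i:] + c[:i]) for i in range(N)]
print(g, counts)       # [1, 1, 2, 0, 2, 2, 1] [3, 3, 3]
print(codewords[0])    # TTGAGGTA
```

For this choice the eight cyclic shifts are distinct, every pair is at Hamming distance 6 = 2 · 3^(k-1), and every codeword has GC content 3 = 3^(k-1).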

Page 31: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Hadamard and Vienna…

Vienna Package: T = 37 °C

http://www.tbi.univie.ac.at/~ivo/RNA/

• Based on Nussinov’s Algorithm
• Gives one Minimum Free Energy Secondary Structure

MFOLD (Zuker et al. 2000)

Page 32: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Why Cyclic Codes?

Let a DNA Code Consist of the Cyclic Shifts of a Codeword $(c_1, c_2, \dots, c_n)$.

Provided that the free-energy table of $(c_1, c_2, \dots, c_n)$ is known, the free-energy tables of all other codewords can be computed with a total of $O(n^3)$ operations only. More precisely, the free-energy table of the codeword $(c_n, c_1, \dots, c_{n-1})$ can be obtained from the table of $(c_1, c_2, \dots, c_n)$ in $O(n^2)$ steps.

Page 33: Coding For DNA Computing: Combinatorial and Biophysical Aspects

T1: Free-Energy Table for CCCAAATGG

     C   C   C   A   A   A   T   G   G
C    0   0   0   0   0   0  -1  -2  -3
C    0   0   0   0   0   0  -1  -2  -3
C    0   0   0   0   0   0  -1  -2  -2
A    0   0   0   0   0   0  -1  -1  -1
A    0   0   0   0   0   0  -1  -1  -1
A    0   0   0   0   0   0  -1  -1  -1
T    0   0   0   0   0   0   0   0   0
G    0   0   0   0   0   0   0   0   0
G    0   0   0   0   0   0   0   0   0

d_WC(CCCAAATGG, GCCCAAATG) = 7, d_WC(CCCAAATGG, GGCCCAAAT) = 6

T2: Free-Energy Table for GACAAAGGT

     G   A   C   A   A   A   G   G   T
G    0   0  -1  -1  -1  -1  -1  -1  -2
A    0   0   0   0   0   0  -1  -1  -2
C    0   0   0   0   0   0  -1  -1  -1
A    0   0   0   0   0   0   0   0  -1
A    0   0   0   0   0   0   0   0  -1
A    0   0   0   0   0   0   0   0  -1
G    0   0   0   0   0   0   0   0   0
G    0   0   0   0   0   0   0   0   0
T    0   0   0   0   0   0   0   0   0

d_WC(GACAAAGGT, TGACAAAGG) = 9, d_WC(GACAAAGGT, GTGACAAAG) = 7

T1: Free Energy: -0.24 kcal/mol; T2: -0.19 kcal/mol

Energies Obtained from Vienna RNA Folding Package (I. Hofacker)

Page 34: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Why Binary Mapping?

Free-Energy Table for the Binary Sequence 111000011:

     1   1   1   0   0   0   0   1   1
1    0  -1  -1  -1  -2  -2  -3  -4  -4
1    0   0  -1  -1  -2  -2  -3  -3  -4
1    0   0   0   0  -1  -1  -2  -3  -3
0    0   0   0   0  -1  -1  -2  -2  -3
0    0   0   0   0   0  -1  -1  -1  -2
0    0   0   0   0   0   0  -1  -1  -2
0    0   0   0   0   0   0   0   0  -1
1    0   0   0   0   0   0   0   0  -1
1    0   0   0   0   0   0   0   0   0

Figure: Hairpin Formed by the Bases of CCCAAATGG

Page 35: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Free-Energy Table for the Binary Sequence 101010110:

     1   0   1   0   1   0   1   1   0
1    0   0  -1  -1  -2  -2  -3  -3  -4
0    0   0   0  -1  -1  -2  -2  -3  -3
1    0   0   0   0  -1  -1  -2  -2  -3
0    0   0   0   0   0  -1  -1  -2  -2
1    0   0   0   0   0   0  -1  -1  -2
0    0   0   0   0   0   0   0  -1  -2
1    0   0   0   0   0   0   0  -1  -1
1    0   0   0   0   0   0   0   0   0
0    0   0   0   0   0   0   0   0   0

d_WC(101010110, 010101011) = 8, d_WC(101010110, 101010101) = 2

Free-Energy Table for the Binary Sequence 110010100:

     1   1   0   0   1   0   1   0   0
1    0  -1  -1  -2  -2  -2  -3  -3  -4
1    0   0   0  -1  -2  -2  -2  -3  -3
0    0   0   0  -1  -1  -1  -2  -2  -3
0    0   0   0   0   0  -1  -1  -2  -2
1    0   0   0   0   0   0  -1  -1  -2
0    0   0   0   0   0   0   0  -1  -1
1    0   0   0   0   0   0   0   0  -1
0    0   0   0   0   0   0   0   0  -1
0    0   0   0   0   0   0   0   0   0

d_WC(110010100, 011001010) = 6, d_WC(110010100, 001100101) = 6

What Type of Sequences Minimize the Entry $E_{1,n}$? Cyclic Shifts with a Minimized Set $\{i : WC(c_i) = c_{i+k}\}$, k = 1, 2, …, m

Page 36: Coding For DNA Computing: Combinatorial and Biophysical Aspects

The Cyclic Distance (Binary Case)

$$S = (s_1, s_2, \dots, s_n), \qquad S^{(i)} = (s_{i+1}, \dots, s_n, s_1, \dots, s_i)$$

$$d_{cyc}(S) = \min_{1 \le i < n} d_H(S, S^{(i)})$$

Known (Peng, 1998):

$$d_{cyc}(S) \le \begin{cases} n/2, & n = 4k \\ (n-1)/2, & n = 4k+1 \\ (n-2)/2, & n = 4k+2 \\ (n+1)/2, & n = 4k+3 \end{cases}$$

Achieved: Maximum Length Shift Register (MLSR) Sequences (Pseudo-Random Sequences in General)

Sequence Weight: w = n/2, n even; w = (n-1)/2, n odd

What are the Reversal Distance Properties of MLSR Sequences?
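The cyclic distance is straightforward to compute for a given binary sequence. The sketch below assumes the definition d_cyc(S) = min over nontrivial cyclic shifts of the Hamming distance to S, and the case-by-case upper bound n/2, (n-1)/2, (n-2)/2, (n+1)/2 for n = 4k, 4k+1, 4k+2, 4k+3. The function names are illustrative.

```python
def cyclic_distance(s):
    """Minimum Hamming distance between s and its nontrivial cyclic shifts."""
    n = len(s)
    return min(sum(a != b for a, b in zip(s, s[i:] + s[:i]))
               for i in range(1, n))

def cyclic_distance_bound(n):
    """Assumed upper bound on the cyclic distance, by the residue of n mod 4."""
    return {0: n // 2, 1: (n - 1) // 2, 2: (n - 2) // 2, 3: (n + 1) // 2}[n % 4]

mlsr = "1110100"   # a length-7 maximum length shift register sequence
print(cyclic_distance(mlsr), cyclic_distance_bound(7))  # 4 4
```

The length-7 sequence meets the bound with equality, in line with the statement that MLSR sequences achieve it.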

Page 37: Coding For DNA Computing: Combinatorial and Biophysical Aspects

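Watson-Crick Distance: Plotkin-Type of Bound

The Watson-Crick Distance:

$$M(M-1)\, d_{WC} \le \sum_{u \ne v} d_{WC}(u, v) = M(M-1)\, n - 2 \sum_{i=1}^{n} \left( x_A^i x_T^i + x_G^i x_C^i \right)$$

$x_A^i$: A Content of Column i; $x_T^i$: T Content of Column i; $x_G^i$: G Content of Column i; $x_C^i$: C Content of Column i

MAX: $x_A^i x_T^i = 0, \quad x_G^i x_C^i = 0$

[Equation: Resulting Bound on M in Terms of n and d, Compared with the Classical Plotkin Bound, $d > n/2$]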

Page 38: Coding For DNA Computing: Combinatorial and Biophysical Aspects

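The Free Energy of a DNA Strand $(c_1, c_2, \dots, c_n)$ can be Approximated According to Breslauer’s Formula

$$E_{free} = \text{correction} + \sum_{i=1}^{n-1} \epsilon(c_i, c_{i+1})$$

Much more Accurate:

$$E_{free} = \text{correction} + \sum_{i=1}^{n-1} \epsilon_1(c_i, c_{i+1}) + \sum_{i=1}^{n-2} \epsilon_2(c_i, c_{i+2}) + \dots + \sum_{i=1}^{n-m} \epsilon_m(c_i, c_{i+m}), \qquad \epsilon_i: \text{weights}$$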

Page 39: Coding For DNA Computing: Combinatorial and Biophysical Aspects

Other Coding Problems

• Generalized de Bruijn Sequences
• Association Schemes for Hamming/RC Hamming/Constant GC Content
• Binary Mapping Approach with Runlength Constraints
• Forbidden Pattern Constraints (Enumeration Techniques by Goulden and Jackson…)
• Catalan Numbers:
  b=1: CN(1)=1   ( )
  b=2: CN(2)=2   ( ) ( ), ( ( ) )
  b=3: CN(3)=5   ( ) ( ) ( ), ( ( ) ( ) ), ( ( ) ) ( ), ( ) ( ( ) ), ( ( ( ) ) )

$$CN(m) = \frac{1}{m+1} \binom{2m}{m}$$
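The Catalan count can be checked against an explicit enumeration of balanced parenthesis strings, which model fully paired, non-crossing structures with m base pairs. A short sketch, with illustrative function names:

```python
from math import comb

def catalan(m):
    """CN(m) = C(2m, m) / (m + 1)."""
    return comb(2 * m, m) // (m + 1)

def balanced(m):
    """All balanced strings with m pairs of parentheses."""
    if m == 0:
        return [""]
    out = []
    for k in range(m):
        # The first "(" encloses k pairs; the remaining m - 1 - k pairs follow.
        for inner in balanced(k):
            for rest in balanced(m - 1 - k):
                out.append("(" + inner + ")" + rest)
    return out

print([catalan(m) for m in range(1, 4)])  # [1, 2, 5]
print(balanced(2))                        # ['()()', '(())']
```

The enumeration for m = 3 yields the five configurations listed on the slide.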