Thermodynamics of Error and Error Correction in Brownian Tape Copying

C.H. Bennett and M. Donkor

IBM Research Yorktown

17 April 2008

For any given hardware environment, e.g. CMOS, DNA polymerase, there will be some tradeoff among dissipation, error, and computation rate.

More complicated hardware might reduce the error, and/or increase the amount of computation done per unit energy dissipated.

This tradeoff is largely unexplored, except by chemical engineers.

Practical Fractional Stills

Most studies of DNA and RNA polymerases tend to focus on mechanisms for recognition of the site and strand of the DNA to be copied, mechanisms for binding, initiation, and termination of copying, etc. rather than the copying process itself.

(cf WEHI animation: see separate mov file)

Here we study the thermodynamics of some simple Brownian copying engines modeled on polymerases, but meant to elucidate the speed-error-dissipation tradeoff at a fundamental level.

The chaotic world of Brownian motion, illustrated by a molecular dynamics movie of a synthetic lipid bilayer (middle) in water (left and right)

Dilauryl phosphatidyl ethanolamine in water: http://www.pc.chemie.tu-darmstadt.de/research/molcad/movie.shtml

RNA polymerase reaction viewed as a one-dimensional driven random walk, or as thermal diffusion on a washboard potential

[Figure labels: activation free energy; driving force]

Tilting the washboard the other way (e.g. by increasing the PP concentration) makes the driving force negative, resulting in reversible erasure or un-copying of an already synthesized strand of RNA.

driving force

But, because the RNA strand separates from the DNA original after leaving the copying enzyme, error transitions have the same driving force as good transitions.

Higher activation barrier for error transition

Copying errors therefore act as reversible obstructions, difficult to insert, and equally difficult to remove.

[Figure: binary tree of copy-tape states, level by level: C, E; CC, CE, EC, EE; CCC, CCE, CEC, CEE, ECC, ECE, EEC, EEE]

Here C denotes a correct base (in a 2-letter alphabet for simplicity) and E an error base.

Error transitions (those that add or remove errors) are less frequent.

Copying with errors may be viewed as a random walk on a binary tree…

Random walk parameterized by: p = forward step probability, and s = selection factor against errors

Forward driving force = kT ln(p/(1-p))

Activation barrier difference = kT ln((1-s)/s)
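The two parameters can be converted to energies directly. The sketch below evaluates the driving force ln(p/(1-p)) and the barrier difference ln((1-s)/s), both in units of kT; the function names are illustrative and not part of the talk.

```python
import math


def driving_force(p):
    """Forward driving force in units of kT: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def barrier_difference(s):
    """Extra activation barrier of an error transition, in units of kT."""
    return math.log((1.0 - s) / s)


if __name__ == "__main__":
    for p in (1.0 / 3.0, 0.4, 0.5, 0.6):
        print(f"p = {p:.3f}  driving force = {driving_force(p):+.3f} kT")
    for s in (0.1, 0.01):
        print(f"s = {s}  barrier difference = {barrier_difference(s):.3f} kT")
```

Note that p = 1/2 corresponds to zero driving force, and p = 1/3 to a driving force of -ln 2.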

Steady State Solution: errors are uniformly and randomly distributed in the copy tape with some frequency ε. The error rate ε and the drift velocity v are calculated as functions of the random walk parameters p and s.

Solve self-consistently for the drift velocity v and the rates of error incorporation and removal, so that the net rate of error incorporation equals the drift velocity times the steady state error frequency ε.

ε v = p s - (1-p) s ε   (net error incorporation rate equation)

v = p - (1-p) ((1-s)(1-ε) + s ε)   (drift velocity equation)
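Writing ε for the steady-state error frequency, the two conditions read ε v = p s - (1-p) s ε and v = p - (1-p)((1-s)(1-ε) + s ε). Eliminating v leaves a quadratic in ε, a ε^2 + (p - a) ε - p s = 0 with a = (1-p)(1-2s). The sketch below takes the root lying between 0 and 1. The function name and the restriction to 0 < s < 1/2 are assumptions made for this illustration.

```python
import math


def steady_state(p, s):
    """Return (eps, v): steady-state error frequency and drift velocity.

    Solves  eps*v = p*s - (1-p)*s*eps  together with
            v = p - (1-p)*((1-s)*(1-eps) + s*eps).
    Assumes 0 < p < 1 and 0 < s < 1/2 (illustrative restriction).
    """
    if not (0.0 < p < 1.0 and 0.0 < s < 0.5):
        raise ValueError("expects 0 < p < 1 and 0 < s < 1/2")
    a = (1.0 - p) * (1.0 - 2.0 * s)
    b = p - a
    # Positive root of a*eps^2 + b*eps - p*s = 0; it always lies in (0, 1).
    eps = (-b + math.sqrt(b * b + 4.0 * a * p * s)) / (2.0 * a)
    v = p - (1.0 - p) * ((1.0 - s) * (1.0 - eps) + s * eps)
    return eps, v


if __name__ == "__main__":
    for p in (1.0 / 3.0, 0.4, 0.5, 0.7):
        eps, v = steady_state(p, 0.1)
        print(f"p = {p:.3f}  error rate = {eps:.4f}  drift velocity = {v:+.4f}")
```

For s = 0.01 and p = 0.4 this gives an error frequency close to 0.34 with a small positive drift, and for p = 1/3 it gives ε = 1/2 with v = 0. Below p = 1/3 the talk notes that the uncopying rate depends on the tape's initial contents, so the steady-state root should not be read as a prediction there.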

[Figure: drift velocity v and error rate ε as functions of forward step probability p, with s = 0.1]

For any s > 0, the steady state has v = 0 and ε = 1/2 when p = 1/3.

[Figure labels: make / unmake error; step forward; step back; +0.1]

Uncopying regime v<0. Uncopying rate depends on tape’s initial error concentration, being slower the more errors originally present.

Paradoxical regime where v>0 despite negative driving force.

Copying regime v>0. After a transient period of copying or uncopying, copying rate becomes independent of initial tape contents.

[Figure: copying rate v versus forward step probability p, with the v = 0 line marked]

Typical random walk trajectories showing how errors impede uncopying

Without any possibility of errors: s = 0, forward step probability p = 1/3. Trajectory drifts leftward.

With errors: s = 0.1, step-right probability p = 1/3. Leftward drift gets stalled by errors (red).

[Figure axis: time]

When the intrinsic error rate s is low but nonzero, the random walk trajectory distribution is skewed, with a long tail on the left.

[Figure: 1000 walks of 4000 steps each, s = 0.01, p = 0.40; axis: time]

For p between 1/3 and 1/2, the drift is initially negative (uncopying), but changes to positive after enough errors accumulate.

[Figure panels: first 720 steps; all 4000 steps, shrunk vertically]
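The sign change can be read off the drift velocity equation v = p - (1-p)((1-s)(1-ε) + s ε), with ε taken as the error concentration of the tape being traversed: v is negative on an error-free tape and crosses zero at a definite ε. The sketch below evaluates that crossover. Treating ε as a freely adjustable concentration, and the function names, are assumptions of this illustration.

```python
def drift(p, s, eps):
    """Drift velocity for forward probability p, selection factor s,
    and error concentration eps on the tape (drift velocity equation)."""
    return p - (1.0 - p) * ((1.0 - s) * (1.0 - eps) + s * eps)


def crossover_error_concentration(p, s):
    """Error concentration at which the drift changes sign (drift = 0).

    Assumes s != 1/2 so that the drift actually depends on eps.
    """
    return ((1.0 - p) * (1.0 - s) - p) / ((1.0 - p) * (1.0 - 2.0 * s))


if __name__ == "__main__":
    p, s = 0.4, 0.01
    print("drift on an error-free tape:", drift(p, s, 0.0))
    print("crossover error concentration:", crossover_error_concentration(p, s))
    print("drift at error concentration 0.34:", drift(p, s, 0.34))
```

With s = 0.01 and p = 0.4 the crossover sits near an error concentration of 0.33, just below the steady-state value of 0.34 quoted for these parameters.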

Skewness and initial leftward drift disappear if walks are initialized with an appropriate concentration of errors.

Left comet is for walks with no initial errors.

Right comet has initial error concentration 0.34, from the steady state model.

s = 0.01, p = 0.4, N = 1000, L = 4000, as before
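Walks of this kind can be reproduced with a short Monte Carlo sketch. The move probabilities used here are an assumption, chosen to match the steady-state equations: add a correct base with probability p(1-s), add an error with probability p s, remove the last base with probability (1-p)(1-s) if it is correct or (1-p)s if it is an error, and otherwise stay put. The talk does not give the simulation details, so the function names and the pre-filled starting tape are illustrative.

```python
import random


def simulate_walk(p, s, steps, initial_error_fraction=0.0, rng=None):
    """Return the net displacement (bases added minus removed) of one walk."""
    rng = rng or random.Random()
    # Pre-existing copy, long enough that uncopying can never exhaust it.
    # True marks an error base, False a correct base.
    tape = [rng.random() < initial_error_fraction for _ in range(steps)]
    start = len(tape)
    for _ in range(steps):
        r = rng.random()
        if r < p * (1.0 - s):
            tape.append(False)   # forward step, correct base
        elif r < p:
            tape.append(True)    # forward step, error base
        else:
            # Back step: an error is as hard to remove as it was to insert.
            removal = (1.0 - p) * (s if tape[-1] else (1.0 - s))
            if r < p + removal:
                tape.pop()
    return len(tape) - start


def mean_displacement(p, s, steps, walks, initial_error_fraction=0.0, seed=0):
    """Average displacement over several independent walks."""
    rng = random.Random(seed)
    total = sum(simulate_walk(p, s, steps, initial_error_fraction, rng)
                for _ in range(walks))
    return total / walks


if __name__ == "__main__":
    for e0 in (0.0, 0.34):
        d = mean_displacement(0.4, 0.01, steps=4000, walks=100,
                              initial_error_fraction=e0)
        print(f"initial error concentration {e0}: mean displacement {d:+.1f}")
```

Running it with s = 0 and p = 1/3 gives a steady leftward drift of about one third of a base per step, while s = 0.1 at the same p stalls after a short initial retreat.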

Dissipation (entropy production per step) is a sum of 2 terms:

• External entropy due to work done by the external driving force

Dext = v ln (p/(1-p)) / |v|

Dext can be negative if v and ln(p/(1-p)) have opposite sign.

• Internal Shannon entropy of incorporated errors (has same sign as v)

Dint = (v/|v|) (-ε ln(ε) - (1-ε) ln(1-ε))

Dissipation per step = Dext + Dint is nonnegative.
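The two terms can be evaluated numerically from the steady-state model. The sketch below takes ε (the steady-state error frequency) and v from the equations ε v = p s - (1-p) s ε and v = p - (1-p)((1-s)(1-ε) + s ε), then forms Dext = sign(v) ln(p/(1-p)) and Dint = sign(v) (-ε ln ε - (1-ε) ln(1-ε)). The function names and the restriction to 0 < s < 1/2 are assumptions of this illustration, and it is meant for the copying regime, where the steady state does not depend on the initial tape.

```python
import math


def steady_state(p, s):
    """Steady-state error frequency eps and drift velocity v (0 < s < 1/2)."""
    a = (1.0 - p) * (1.0 - 2.0 * s)
    b = p - a
    eps = (-b + math.sqrt(b * b + 4.0 * a * p * s)) / (2.0 * a)
    v = p - (1.0 - p) * ((1.0 - s) * (1.0 - eps) + s * eps)
    return eps, v


def dissipation_terms(p, s):
    """Return (d_ext, d_int), the external and internal entropy per step."""
    eps, v = steady_state(p, s)
    sign = math.copysign(1.0, v)
    d_ext = sign * math.log(p / (1.0 - p))
    d_int = sign * (-eps * math.log(eps) - (1.0 - eps) * math.log(1.0 - eps))
    return d_ext, d_int


if __name__ == "__main__":
    for p in (1.0 / 3.0, 0.4, 0.5, 0.7):
        d_ext, d_int = dissipation_terms(p, 0.01)
        print(f"p = {p:.3f}  Dext = {d_ext:+.4f}  Dint = {d_int:+.4f}  "
              f"total = {d_ext + d_int:.4f}")
```

At p = 0.4 and s = 0.01 the external term is negative (about -0.41) but is outweighed by the entropy of the incorporated errors (about 0.64), and at p = 1/3 the two terms cancel exactly.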

[Figure: dissipation per step and incorporated error rate, on a logarithmic scale from 1 down to 10^-6, plotted against driving force = ln(p/(1-p)) over the range -2 to 2, with -ln 2 marked; uncopying speed -v and copying speed v are also shown. One label reads "0 = 10^-4".]

Proofreading in DNA Replication

Polymerase activity (1) tries to insert correct base, but occasionally (2) makes an error. Exonuclease activity (3) tries to remove errors, but occasionally (4) removes correct bases. When both reactions are driven hard forward, the error rate is the product of their individual error rates.

Dissipation mainly in external driving reactions

Dissipation mainly in incorporated errors. At high error rate, this pushes process forward even against uphill external driving force

(slower and/or more dissipative)

[Figure: state diagram over C, E, CE, EC, CC, EE; labels: discrimination s, f]

Non-proofreading kinetics is degenerate, reflecting locality of error processes in computation: an error is only uncomfortable while it is being made. Thereafter it is neither favored nor unfavored compared to a correct digit. This is why errors are difficult to remove in a simple Brownian copying process.

[Figure: state diagram over C, E, CE, EC, CC, EE]

Proofreading scheme by contrast behaves kinetically like an energetically nondegenerate case of simple Brownian copying, in which errors are permanently uncomfortable, being hard to make but easy to unmake, as if they had an intrinsic energy cost even after incorporation. (This would be the case if the original and copy strands stuck together after copying.)

But energetically there is a difference. An energetically nondegenerate scheme could maintain an error probability < ½ without dissipation at zero drift, but the proofreading scheme requires continuing dissipation, because of the cycling action of polymerase and exonuclease undoing each other's work.

[Figure: proofreading state diagram with exonuclease and polymerase transitions among C', E', C, E, CC', CE', EC', EE', CE, EC, CC, EE]

A one-stage proofreading scheme can reduce the error rate to as low as s^2, in the limit of high dissipation.

A 2-stage generalization might intersperse a provisional state (yellow) between each stage of copying. A multi-stage generalization would intersperse a chain of N-1 provisional states, like a fractional still or isotope enrichment cascade.

We are studying the Speed/Error/Dissipation tradeoff for Brownian copying by efficient schemes of this sort.

Another open problem is how to generalize this analysis to more general kinds of computation than simple tape copying.