Opinion

THE MATHEMATICAL INTELLIGENCER VOL. 13, NO. 3 © 1991 Springer-Verlag New York

The Opinion column offers mathematicians the opportunity to write about any issue of interest to the international mathematical community. Disagreement and controversy are welcome. An Opinion should be submitted to the reviews editor, Chandler Davis.

The Information in Your Hand

Jack Cohen and Ian Stewart

Hand as Information. Your hand is designed according to certain instructions coded up in your DNA. The length of these instructions gives a measure of the amount of information in your hand.

Rudy Rucker, Mind Tools

Each age interprets its universe in terms of what is currently important to it. When ancient animistic Man wanted to make sense of the starry sky, she saw it as a zoo of people and animals: the Hunter, the Swan, the Lion, the Dog. The Mechanical Age of the 18th Century bred a mechanistic philosophy, the clockwork universe, with God as the watchmaker who set the wheels spinning and then stood back to watch his creation turn. Our present Computer Age sees the universe as an ever-changing flow of information. If we were to discover the stars today, our first instinct would be to try to decode their message.

So when, in the Computer Age, Crick and Watson stumbled across the double helix of DNA and its aperiodic sequence of nucleotides, it was inevitable that DNA would be seen as a "program" or "code" that contained the "genetic information" needed to make you and me. Indeed, it was a major breakthrough, perhaps the major breakthrough, of this century to decipher the "genetic code" whereby triples of nucleotides specify protein structure. From such a viewpoint DNA is a genetic message transmitted from parent to offspring, a list of instructions, like a glorified knitting pattern. And, just as we can look at a knitting pattern and see which part of it governs the design of the neckline or the armhole, we imagine that if only we were clever enough we could look at the DNA pattern and see which part of it governs the design of a neck, or an arm.

Or a hand.

And of course, the more complicated the sweater we want to produce, say one with an intricate lacy three-dimensional effect looking like butterflies on a background of bulrushes, the more information the knitting pattern must provide. So the longer the DNA sequence is, the more complicated must be the part of the organism for which it contains the instructions, and the more information that part must "contain."

It's a picture of DNA as the Book of Life. You can imagine thumbing the pages of the genetic handbook, looking for the Sentence that produces hemoglobin, the Paragraph that produces a blood cell, the Chapter that produces an artery, even the Appendix that produces an appendix. The Book of Life image is often explicit in the sales pitch for the self-proclaimed Great Project of sequencing the human genome. It is the world-view of the SF writer Tom Easton's "gengineer" stories, in which you can tear out the pouch-page from the Book of the Kangaroo and glue it into the Book of the Albatross to get air mail. Above all it's a picture of information as data-string: the longer the sequence of instructions, the more information it contains.

So, of course, because there are more letters in "quadruped" than in "dog," the message "Fido is a quadruped" contains more information than "Fido is a dog."

And since the DNA of a mammal contains fewer nucleotides than that of an amphibian, and some amoebae contain a hundred times as much DNA as either, it follows that mammals are pretty simple creatures, really, and amoebae are amazingly complex in comparison.

Once you start thinking like that, you begin to realise that DNA-as-message must be a flawed metaphor.

Convergence

The idea of DNA as genetic information sits uneasily with the phenomenon of convergence. Different "causes" can produce the same "effect." Flight, for example, has arisen at least four times in the history of evolution: in pterosaurs, insects, birds, and bats. The wing is a common structure in the world of living creatures. But these creatures do not possess some common DNA sequence that produces wings.

There is also a great deal of convergence within a single species. For example, chemical changes are highly dependent upon temperature. Frogs develop from tadpoles in ponds whose temperatures vary from perhaps 5°C to 25°C within the course of one day. Many of the genetic instructions in frog DNA buffer the frog biochemistry against temperature changes. This leaves a great deal less "information" to determine the basic developmental program around which the buffering routines fit. One is left with the uncomfortable feeling that an adult frog is far too complicated an object to be produced by the amount of information that we know exists in its DNA.

Information

The idea of information as a quantity was invented by Claude Shannon in the late 1940s, and it arose from engineering problems in telecommunications. Information theory models the following situation: a message (represented as a string of binary digits 0 and 1) is to be sent from a transmitter to a receiver. In the simplest setup each digit is considered equally likely, and thus conveys exactly one bit of information. In this case, a message of n binary digits contains n bits of information, so here the longer the message, the more information it contains.

The primary concerns of classical information theory are twofold: noise in the communication channel, and coding of information. Noise degrades the signal and reduces the information-carrying capacity of the channel. Encoding the message at the transmitter and decoding it at the receiver is a mathematical device to protect against degradation by noise; it can also incorporate situations in which the probabilities of individual message components are non-uniform, or where there is extra structure or redundancy in the original message. The unequal probabilities of different components of the message are transformed by the coding procedure into (possibly) equal probabilities of 0's and 1's in the actual signal sent.

In consequence, when applying a simple bit-count to deduce the quantity of information contained in some message, it is important to take the context, the assumptions that lie behind the encoding/decoding method, into account. Only in the ideal situation does every possible message occur with equal probability; only then does each binary digit carry one bit of information. In the "transmission" of genetic "information" from parent to offspring, the context is currently largely unknown. It could make an awful difference!
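The dependence of the bit-count on the assumed probabilities can be made concrete with Shannon's entropy formula, H = -Σ p log2 p. What follows is a minimal sketch, not part of the original argument: the function name `entropy_bits`, the message length of 1000 digits, and the 90/10 biased source are all illustrative assumptions.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

n = 1000  # message length in binary digits (arbitrary illustrative choice)

# Ideal case: 0 and 1 are equally likely, so each digit carries one bit.
uniform = entropy_bits([0.5, 0.5])

# Hypothetical biased source: 0 occurs 90% of the time.
biased = entropy_bits([0.9, 0.1])

print(f"uniform: {uniform:.3f} bits/digit, {n * uniform:.0f} bits in total")
print(f"biased:  {biased:.3f} bits/digit, {n * biased:.0f} bits in total")
```

Under the biased assumption each digit carries a little under half a bit, so two strings of the same length can carry quite different quantities of information, depending on what the receiver is assumed to know about the source.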

Thought Experiments

Here are a number of thought experiments. Their aim is not to cast doubt upon the validity of classical information theory, for each has an interpretation within that scheme in which it makes perfectly good sense. The object is to make it clear that naive bit-counts, not taking context into account, can generate nonsense; and to show that the aspects of a message that matter to human beings, such as meaning, understanding, and development, do not fit readily into the information-theoretic mould. We are not claiming that Shannon ever intended them to; but we feel that the distinction between bit-count and meaning is not always appreciated.

1. "If I don't phone you tonight, Auntie Gertie will be arriving on the 4:10 train from Chattanooga. Take her home."

2. "You'll find what you want on pages 75-94 of volume 77 of the Bulletin of the American Mathematical Society."

3. "I♥NY."

4. On a television screen, the caption "Call 0800-666-777777 to make a donation."

Experiment 1, on the face of it, conveys a sizeable quantity of information with a zero-bit message, though since the alternative is that you do phone, it's really a one-bit message. An enormously complicated sequence of events is set in train by the absence of a telephone call: get out address-book to find Auntie's address so that you can think about traffic-patterns and work out the best route to take her home while you're on the way to the station; put on coat, open front door, go through, shut it again, get keys from pocket, open car door, get in, shut door, start car, engage gear, let out clutch, avoid neighbour's cat in driveway, turn left on to street . . . .

In experiment 2 a message of 105 characters, say 525 bits in less-than-optimal code, triggers access to the entire information in twenty pages of a technical journal, say around 80,000-100,000 bits.

In experiment 3, a masterpiece of the advertiser's art, a simple but direct message is conveyed in a mere four characters.

In experiment 4, a message of thirteen decimal digits, or around 43 bits, is received. However, the engineers who designed the format of television signals know that the actual amount of information consumed by the appropriate segment of the TV screen is far higher. Around 100 lines, each of 1000 individual phosphor dots in three colours, must be activated: say 800,000 bits. To transmit the telephone number by television, you have to send an 800,000-bit message! It won't work otherwise.
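The figures quoted for experiments 2 and 4 can be checked with a few lines of arithmetic. This is a sketch only: the rate of 5 bits per character is inferred from the essay's own figure of 525 bits for 105 characters, the 800,000-bit screen estimate is taken as given, and the variable names are illustrative.

```python
import math

# Experiment 2: 105 characters at an assumed 5 bits per character
# (5 bits distinguishes 32 symbols, a "less-than-optimal code").
chars = 105
bits_per_char = 5
exp2_bits = chars * bits_per_char

# Experiment 4: thirteen decimal digits, each worth log2(10) bits
# if all ten digit values are equally likely.
digits = 13
exp4_bits = digits * math.log2(10)

# The essay's own estimate for the screen segment showing the caption.
screen_bits = 800_000

print(f"experiment 2: {exp2_bits} bits")
print(f"experiment 4: {exp4_bits:.1f} bits")
print(f"screen-to-caption ratio: about {screen_bits / exp4_bits:,.0f} to 1")
```

On these assumptions the television signal spends many thousands of bits for every bit of telephone number, which is the sense in which a naive count of transmitted bits overestimates the message.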

In each case it is reasonably clear what mechanism is operating, where to locate the sleight-of-hand that turns comfortable communication-theorist's information from a conserved quantity into something so malleable that there is no point in measuring it. Indeed, each can be viewed as an exercise in coding, "triggering" the access of information from a specific range of possibilities.

Bit-Counts Don't Quantify Meaning

Yes, but . . .

There is a species of biological theorist, increasing in numbers, that counts the quantity of information carried by a segment of DNA and deduces limits on the complexity of the resulting animal. Does this make sense?

One could count the quantity of information transmitted in examples 1-3 and deduce limits on the complexity of the resulting actions. Those limits would be gross underestimates. For example 4, on the other hand, it would be a gross overestimate. Mere bit-counting ignores the context in which the "message" is sent. It bears no relation to the true "information utility" of the message; that is, the complexity of the action that it initiates. In more familiar terms, how much meaning it possesses.

If the manner by which DNA code is transformed into creatures is ignored, we have no idea whatever of the possible complexity of the creature that results from a given segment of DNA. It takes very few bits to send "make a tiger"; and a receiver that understood such a message (i.e., "knew" what it had to do to implement it) would need nothing more, apart from appropriate context, to construct the world's most beautiful feline. Such receivers do exist, namely zoo curators, and the context is equally straightforward: a pair of tigers.

On the other hand, twenty million DNA codons might do no more than convey the colour of the animal's ear-tufts, if some TV-like system were in use. (Oddly, it isn't even necessary to code for the colour of ear-tufts provided, as is usual, you want them darker than the rest of the animal. The chemical that determines the colour is temperature-dependent, and ear-tufts are cool because they are at an extremity. It is trivial to "code for" darker ear-tufts, just as the absence of a telephone call "codes for" meeting Auntie G.)

Prescription, rather than description, is closer to the mark; not just for DNA "messages," but for any message outside of the abstract setting of information theory, which deliberately strips out the context. A prescription from the doctor is not of itself a cure: it only turns into one when taken to a drugstore, "received" by a pharmacist, and acted upon. All "messages" in the real world that really are messages happen within a context. That context may be evolutionary, chemical, biological, neurological, linguistic, or technological, but it transforms the question of information content beyond measure.

The accurate definition of the information content of a message relies upon the context. When speaking of technology, we generally know, at least in principle, what the contextual contribution is. We don't normally try to play a compact disc on a telephone answering-machine. But when thinking about the natural world, we often forget that we do not know how much contextual input there is into processes that we like to model as "message-sending."

Many biologists talk of developmental processes being "switched on" by genes; for example, that hereditary disease A is "caused by" defect B in gene C, or that gene D "codes for" structure E. In this style of thinking the DNA sequence is a computer program, and the organism appears when you "run" this program. Genetic engineering is analogous to computer hacking. So ingrained has this type of picture become, that many biologists act as if "it's programmed in the DNA" answers everything. It's not that the "program" image is completely false; just that it's only part of the process. What about the "computer"? How does that work? How does the "information" in the DNA "program" lead to a fully developed organism? What else is needed? These are important questions, to which the "bit-count" measure of information is at best marginally relevant.

What Message?

The metaphor of DNA as a "message" from parent to offspring doesn't hold up under scrutiny. When the "message" is transmitted, there is no receiver. The message, indeed, is supposed to describe how to construct the receiver! Strictly speaking, the "genetic code" isn't even a code. It is true that DNA "codes for" (that is, determines) proteins, but there is no converse process of encoding proteins into DNA. DNA is not "transmitted," but copied (subject to the complications of sexual reproduction); and the process whereby DNA "becomes" offspring also involves the parent.

Our obsession with information technology and messages as bit-strings has led us to focus almost exclusively on DNA as "software" and to ignore the contextual "hardware" (or "wetware") in which it produces actions. Moreover, there are other things than DNA that also pass from parent to offspring, things that on a biochemical level are comparable to DNA but which we don't think of as coding anything, and therefore fail to think of as conveying information. In most sexually reproducing cellular animals the egg begins development without involving the embryo's own genes. Only when the "ground plan" is sufficiently well developed in the embryo structure do the embryo's own genes take control. Mammals take the whole process much further: they put a great deal more into the mother, thereby simplifying what has to be put into the embryo's DNA. We have already mentioned that a large part of frog DNA deals with alternative enzyme pathways for different temperature levels. In contrast, in a mammal the uterine temperature is kept constant by the mother's own regulatory systems; so mammals don't need to put that kind of "information" into their DNA. This is why mammal DNA contains fewer nucleotides than amphibian DNA, while managing to produce animals that are manifestly "more complex." We might now speculate about a super-mammal that puts the available "extra" DNA to good use . . . .

Humans take the whole process one stage further. Much of what we need in order to be human is genuinely "transmitted" to us as a message: not genetically, but through our brains. Language is an example. If language were "hard-wired" into our genes (assuming this to be possible), then we wouldn't have to learn it; but it would presumably use up a lot of DNA. Instead, our DNA seems just to code for language-learning ability within a brain that has already evolved for other reasons; then the language itself is transmitted culturally, from the mother and other adults to the child. This method is far from foolproof (the language that we learn is an imperfect rendition of what is taught to us), but it gains in efficiency and flexibility. In Richard Dawkins's terminology, we not only pass on our genes to our offspring, but our memes (self-perpetuating mental structures) as well.

Why, then, do we focus so obsessively on the DNA sequence? Because it looks like a message, a code, a piece of software. That metaphor has borne considerable fruit, but it can also be a snare and a delusion.

Our complexity is not determined by the number of nucleotides in our DNA sequence: it is determined by the complexity of the actions that can be initiated by those nucleotides within the overall system that constitutes not just us, but our parents, other ancestors, and indeed our cultural heritage. Much of that complexity, in our case, is built into the overall system: it is not coded into our DNA. The development of a hand, for example, is part of the culmination of a series of processes that produces our skeleton, our muscles, our skin, and so on; each stage being dependent upon the current state of others, and all of them dependent upon contextual physical, biological, chemical, and cultural processes for which no "information" as such is required.

A part of DNA can no more code for a hand than a part of the scaffolding for a building under construction can hover unsupported in midair. Your hand contains flesh, blood, and bone, but no information.

Jack Cohen
The Lodge
39 Greenhill
Blackwell
Bromsgrove B60 1BL
United Kingdom

Ian Stewart
Mathematics Institute
University of Warwick
Coventry CV4 7AL
United Kingdom
