
Page 1: Allport - 1985 - Distributed Memory, Modular Subsystems and Dysphasia

2 D. A. Allport

Distributed memory, modular subsystems and dysphasia


I take it as self-evident that the dysphasias (acquired disorders of language) are a class of memory disorder. Of course, this is not to say that they are, primarily, impairments of 'episodic' memory, that is, of memory for particular experiences or events; but they are impairments, nonetheless, of memory, or memory-retrieval, for the previously familiar patterns of language. Dysphasic memory impairments are seen, for example, in the difficulty of retrieving the spoken form of a word, given some specification of its meaning; or in retrieving a meaning, given the spoken form; in recovering the orthographic (written) form of a word, given its spoken form; and so on.¹ Let us call the ability that is needed for such tasks, and which is evidently disturbed in dysphasic impairments, 'language memory', to distinguish it both from the more general (non-episodic and non-linguistic) knowledge of the world and from memory for particular, experienced (non-linguistic) events or episodes, both of which may be well preserved in many forms of dysphasia (Allport, 1983a).

If this is granted, that the dysphasias represent a class of memory disorders, it must be equally evident that we shall need a theory of memory retrieval and memory interference (a theory of the nature and origin of confused or incomplete or inaccurate retrieval) as an essential tool in the understanding of dysphasia. In spite of this, there have been surprisingly few attempts to apply such theoretical understanding as we have of the psychology of memory to the phenomena of dysphasia.

Certain recent developments in the fundamental conception of memory processes, and of their possible embodiment in physical structures like the brain, now make this a more promising enterprise than it has appeared hitherto. The key developments here are in models of 'distributed' memory and of parallel-associative processes of retrieval (Hinton & Anderson, 1981). My aim in this chapter is to introduce these theoretical ideas, so far as is possible in a non-technical way, and to consider some of their implications for our understanding of the nature of dysphasic difficulties. Before doing so, however, it will be worthwhile, by way of contrast, to consider the currently dominant approach in cognitive neuropsychology to the understanding of language and language disorders, namely the identification of isolable 'processing components' (modular subsystems), and to review, briefly, the strengths and limitations of this approach (Section I). In Section II I outline various different levels of explanation, as applied to neuropsychological data. I shall then be in a position to introduce, in Section III, some of the essential ideas of distributed memory in a way that, I hope, may make them intuitively accessible to the non-mathematical reader. Finally, in Section IV, I consider how these ideas apply to aspects of brain-injured, dysphasic performance and to what are, or are not, valid neuropsychological 'components'.


Despite the immense and rapidly growing quantity of information available on the anatomy and physiology of the brain, we still know almost nothing about the processes in the nervous system responsible for language or other higher-level cognitive abilities. The traditional aphasiological approach, the correlation of behavioural deficit with anatomical lesion site, has yielded somewhat slender dividends in terms of insights into the disordered processes. Meanwhile, independent of the neurosciences, the psychological investigation of normal cognitive abilities has developed in a number of important ways. In the 'information-processing' approach to cognitive psychology, or 'cognitive science' (Norman, 1981), a key idea has been that the mechanisms of behaviour can be described at an abstract, or process level, without any reference to the physical or biological hardware involved, much as a computer program can be written without explicit reference to the physical machine on which it will run. In this tradition, information-processing models of cognitive processes are often expressed in flow-chart form, that is, as blueprints for a set of computable processes, where the long-term goal is the complete specification of these processes in a working computer program. (In the psychology of language, McClelland & Rumelhart's (1981) model of written word recognition provides an elegant, representative example of this kind of theory-building.)



If the flow-chart model is seen as a step towards formulating a fully-specified, computable model, a still more preliminary step is to try to identify the in-principle-separable components (the building blocks) of the system as a whole, components which may then be studied and modelled in an intelligible way, at least partly in isolation from the rest of the system. There is increasing support for the view that the staggering complexity of human behaviour is the product of interaction among many different, semi-independent subsystems, each performing a unique, specialist role in the overall organization (e.g. Allport, 1977, 1980; Minsky, 1979; Fodor, 1983). An analogy is often drawn with a 'society of experts', a large organization (the CIA provides a favourite example) in which different units of the organization have different skills and are in possession of different pieces of information; in which no single member of the organization can possibly possess all the knowledge, or all the expertise, contained in the organization as a whole. The analogy between individual minds, and societies, has a number of interesting features. For the present, the essential idea is that the functional components of mind are, in general, special-purpose rather than general-purpose elements in the working of the whole system (Allport, 1980). If this general view is correct, then to characterize in outline any of these separable subsystems (to discover broadly what it does and with what other subsystems it communicates) becomes an essential preliminary to constructing a detailed, information-processing model of how that subsystem computes the specialized functions that have been ascribed to it.

In the light of this preliminary but essential goal, the recent surge of interest among cognitive psychologists in the phenomena of dysphasia, and other behavioural consequences of brain injury, is easily understood. Cognitive psychologists have come to recognize the potential of the individual, neuropsychological case-study to reveal dissociable behavioural deficits, and hence to provide clues about the functional separability of the underlying component mechanisms (e.g. Marin et al., 1976; Patterson, 1981; Shallice, 1979). Equally ambitiously, it is hoped, selective impairment of some components may permit a uniquely privileged view of the working of the remaining, intact systems.

The most convincing defence of this research strategy is its ability to produce consistent interpretations of dysphasic performance, converging on the same functional components as can be inferred from experiments using normal subjects. Much recent research, particularly that concerned with the processing of written language, can claim to provide successful illustrations (e.g. Coltheart et al., 1980; Patterson & Coltheart, 1984). A particularly influential example, to which we shall need to return, is Morton's logogen model, a model of normal lexical organization which has been applied to several varieties of dysphasic and dyslexic performance (e.g. Morton, 1980; Ellis, 1982).

I share the enthusiasm and excitement over the 'modular subsystems' approach. At the same time, it is important to recognize certain potential limitations in its application to brain-injured patients.

The strategy rests on two rather strong assumptions. The first is that biological information-processing systems (human minds) are indeed highly modular in organization, in the way suggested, not only in their abstract or 'functional' organization but also, and equivalently, in their anatomical embodiment, so that localized anatomical lesions can selectively damage just one or a small number of psychologically intelligible subsystems, leaving other subsystems physically unimpaired. The second assumption is that, in this case, the ensuing behaviour reflects the normal operation of the remaining, intact subsystems, minus the contribution of the damaged components, without major compensating reorganization on the part of the surviving components. This latter assumption, in particular, appears threatened by the evident fact that, following cerebral injury, at least some recovery of language, as of other cognitive abilities, is almost always observed (cf. Newcombe and Ratcliff, 1979; Finger and Stein, 1982). Indeed, this is surely the aspect of dysphasia of principal interest to therapists, and to the patients themselves. Yet contemporary analyses of language mechanisms at the level of 'separable functional components' (the box-and-arrow notation of current cognitive neuropsychology) appear to have nothing that they can say about it.

Functional components and cerebral lesions

There are, however, more serious problems at stake, all of which reflect a mismatch with the level of description needed to accommodate the manifestations of brain injury in dysphasia. First, and obviously threatening to this approach, Wood (1978, 1982) has shown how, in a distributed memory system, a clear 'double dissociation' between behavioural deficits can be consistent with complete overlap in the underlying representations. Of this subject, however, more later. The inferred separable components (the 'boxes' and 'arrows' of current information-processing models in neuropsychology) are highly abstract entities, whose psychological validity and interpretation, it is claimed (e.g. Morton, 1981), is independent of any possible, physical implementation in the brain. Applied to the behaviour of intact, normal subjects, this approach seems reasonable enough. Directed towards the understanding of the effects of neurological injury, it appears less obviously satisfactory. What, specifically, might it mean to think of 'lesioning' a component in such an abstract, disembodied system?

It is, perhaps, straightforward enough to think of the simple deletion of an entire component, a 'box' or an 'arrow', from such an information-processing model. This option, however, lays itself open to the objection raised by Freud (1891) against the earlier 'diagram makers', to the effect that theorizing in this form seems to reflect dysphasic performance as though seen in silhouette, without internal structure. For this level of analysis, the 'ideal' dysphasic data should take the form of complete failure on one set of tasks, normal (intact) performance on another set. In practice, such complete functional dissociations are seldom, if ever, seen. On the other hand, what is seen every day in the dysphasic clinic is reduced efficiency of performance in one or more domains: slower and less reliable word-finding; partial or incomplete retrieval of word-meanings; increased confusability between similar items or similar constructions; and so on. How these all-too-familiar phenomena of diminished, but not zero, performance within any one processing domain are to be explained by theories at the level of Independent Processing Components is far from obvious.

Of course, this is not to deny that focal head injury may result in selective impairment of particular domains of language processing. The point at issue is that, within any one domain, the impairment is, most commonly, partial (a general reduction of efficiency), not all-or-none.

Another feature of the reduced efficiency of dysphasic performance deserves comment here. When the same tests designed to probe receptive or expressive lexical knowledge are repeated over a period, success on individual words typically fluctuates from one occasion of testing to another, even though the overall test scores may remain remarkably consistent. That is, particular classes of words can be differentially affected, as a group, in a consistent way. However, what appears not to occur is the permanent loss of unique, individual written or spoken word-forms, leaving others in the same class intact. The same applies to memory for other recurrent patterns, such as faces or melodies. Whereas the recognition of previously familiar faces can be impaired in general, what has never been reported is an acquired, selective inability to recognize (say) one's grandmother, while recognition of one's grandfather is preserved. Brain lesions may have selective effects at the level of whole processing components, but not, it appears, at the level of individual words or objects in memory.

Clearly, none of the features of dysphasic performance that I have mentioned show the analysis of psychological mechanisms in terms of distinct, but interacting, subsystems to be in any sense wrong. Far from it. They are, nonetheless, examples of very obvious and general phenomena that theories, at that level of abstraction, simply fail to engage at all. As I suggested earlier, if we are to get some theoretical insight into them, we shall need to look for theories at a different logical level of description.


David Marr (1981) has put forward a theoretical framework for our understanding of the processes of vision, a field in which information-processing analysis is a long way ahead of the corresponding research in language. In presenting this framework, and the progress within it so far achieved, Marr illustrates the point again and again that, if one hopes to understand any complex information-processing system, one will need different kinds of explanation at several different levels of description, levels which may be, at first, only very loosely linked. Some properties of a system's behaviour will be most appropriately explained at one level, some at another. To make matters harder, it is by no means always obvious in advance which level of explanation will be the most appropriate, or tractable, for any given behavioural phenomenon.

Marr & Nishihara (1978) distinguished four levels of description. To begin with, there is the analysis of basic components and their local circuitry: how do transistors and diodes (neurons and synapses) work? At another level up are questions about implementation: how are assemblies of the basic components arranged to implement particular mechanisms (the adders and multipliers of a pocket calculator, for example)? Most importantly, for our present purpose, at this level arise questions about how the fundamental mechanisms of memory (storage, comparison, retrieval) are implemented.

The third level is that of representation and algorithm, the level of description at which most current work in artificial intelligence, and much of cognitive psychology, is aimed. Here the central questions are: (a) what aspects of the information being handled by the system are given explicit² internal representation, so that they can be used directly by a given process; (b) at what 'stage' in the system, i.e. from which other representations, can they be obtained; and (c) how (i.e. by what computable procedures) are they derived? Cognitive psychologists have tended to concern themselves more with the first two of these questions, with identifying distinct or common (shared) codes of representation, and with mapping their channels of intercommunication, than with specifying computable procedures for transforming one code into another. Examples of questions of types (a) and (b) in the cognitive psychology of language would include many currently live issues about the organization of the mental lexicon. (For instance, in the perception or production of speech, is there any level of explicit representation of systematic phonemes? In written word-recognition, is there a stage of representation of abstract letter-identities? If so, what other stages or subsystems can read from (have inputs from) this particular code? Are there distinct lexical and non-lexical coding systems by which a skilled reader can derive pronunciation from print? Etc.)

Finally, the top level of description contains the abstract theory of the computation or process being performed, that is, the theory in the broadest sense of what is being done, and why; and what are the constraints provided by the world in which it operates that make it possible? As regards language, the level of 'computational theory' perhaps corresponds most nearly to that of abstract theoretical linguistics.

In terms of these four levels of analysis it is not immediately obvious to which level the cognitive neuropsychologists' 'functionally separable components', inferred from dissociable behavioural deficits, should be assigned. Arguably, the most global of these component distinctions, such as, for example, the distinction between 'logogen system' in general and 'cognitive system', belong properly to the level of the abstract 'computational' (linguistic?) theory. Similarly, linguistic intuitions regarding the broad decomposition of the language faculty into syntactic, semantic, phonological (etc.) domains, which have claimed support from the major categories of dysphasic impairment (e.g. Caramazza & Berndt, 1978; Lesser, 1978), belong at this level. There is a parallel here with Marr's use of neuropsychological dissociations in the perception of three-dimensional objects (Taylor & Warrington, 1973) as the basis for certain fundamental choices, at the 'computational theory' level, about the overall organization of the visual process (Marr, 1981).

Equally, it can be argued that the neuropsychologists' separable functional components correspond, or at least ought to correspond, one for one with distinct representation systems, i.e. distinct attribute codes (e.g. Allport, 1980; Monsell, 1983). Hence they belong to the next level of analysis, the level of 'representation and algorithm'.

At either of these levels of analysis, however, we find little help in understanding what it might mean to 'lesion' (to injure rather than to eliminate) one of these abstract components. Where physical injury results not in the total abolition of some function (or representational ability) but in a reduction of its scope and efficiency (for example in diminished vocabulary, slower, unreliable and errorful retrieval, etc.), then the box-and-arrow notation of current functional-component models (e.g. Morton, Ch. 9) offers no obvious way to accommodate these changes.

To understand these behavioural effects we need also to have a model of functionally separable components at the (neural) implementation level. The principal aim of this paper is to motivate, and to provide at least an introduction to, such a model.

Even to suggest such an enterprise evokes responses of dismay, even of abrupt dismissal, on the part of many cognitive psychologists. Clearly there is a yawning theoretical gulf here. On one side of the gap there is a vigorous, even flourishing cognitive psychology, applied to both normal and pathological language processes, operating almost exclusively at Marr's third level (symbolic representations). On the other side of the gap there are dramatic and continuing advances in the neurosciences, almost entirely at the 'basic components' level. Between these two, however, questions at what Marr called the implementation level appear to have been very largely ignored by those on either side.³

In spite of this, if we are to press our question, 'To which level of description does the analysis of modular subsystems or "separable functional components" properly belong?', the correct answer appears to be: all levels, down to and including that of fundamental mechanism or 'implementation'. That is, the way in which psychological processes emerge from interactions among modular subsystems has strong implications for, and is in turn illuminated by, analysis at each of Marr's three levels of description: the top level, computational (linguistic) theory; the level of symbolic representation and process; and the level of physical implementation. Indeed, it is because questions about the modular decomposition of the mind/brain arise at each of these levels, and because their solutions have important implications for other questions at each level, that the identification of modular subsystems represents such a primary and essential goal for the sciences of cognition (theoretical and computational linguistics, cognitive psychology, neuropsychology) and for the understanding of language pathology.


Semantic nets and neural nets

A widely accepted notation for representing the structure of lexical and semantic knowledge, adopted both in psychology and in computer science, takes the form of a 'semantic net': a network of concept nodes and labelled, directed relational links (e.g. Collins & Loftus, 1975). Most discussions of semantic nets are confined to the abstract level of 'representation and algorithm', without reference to their possible embodiments in (neuronal) hardware. In neuropsychology and the study of dysphasia, however, if we are to understand the disorders of lexical and semantic memory that result from physical injury we shall undoubtedly need an explicit theory of the relationship between the abstract representation level and the level of its physical implementation.

One obvious possibility is to suppose that different concept nodes in a semantic network correspond to different physical elements in the hardware (neurons, cell-assemblies, etc.), and that the relational links between concepts (and between concepts and word-forms) similarly correspond to particular physical linkages. Evidently, this is a possibility that many people take quite seriously, both in neurophysiology (e.g. Barlow, 1972) and, mutatis mutandis, in artificial intelligence (e.g. Fahlman, 1979). However, it is not the only one. Another possibility is that each 'concept node' corresponds not to a distinct part, or component, of the hardware but to a particular pattern of activity in it. Different concept nodes, in this sort of implementation, can be represented by different patterns of activity in the same set of physical units, the same network. That is, the representation of concepts is 'distributed'.

Distributed memory

To get a better idea of what this might mean, consider the following simplified example, for which I am indebted (as throughout this section) to Hinton (1981). Imagine a network of simple hardware elements (switches) and their physical interconnections, as illustrated in Figure 2.1a. Each element has two possible activity states, either 'on' or 'off', which can be represented symbolically by a 1 or a 0. Figure 2.1b shows a sequence of activity states of five hardware elements, resulting from an initial input and mutual interactions within the network.

As I have already emphasized, the same physical system can be described at more than one level of analysis. Thus, the behaviour of our hypothetical network could be described either in terms of the activities of the individual hardware elements (as in Figure 2.1b) or, alternatively, at a higher level, in terms of the activity of the network as a whole. That is, recurring patterns of activity across all five elements now become the units of analysis, the basic descriptive elements to which particular names could be assigned. In this way, Figure 2.1c depicts the sequential relationships between patterns of activity of the hardware elements in Figure 2.1a. It is important to see that, while the diagrams in Figures 2.1a and 2.1c are superficially similar, their interpretation is radically different. In Figure 2.1a the nodes represent physically distinct parts of the machine; arrows represent individual hardware connections; and many different nodes can be 'on' at the same time. In Figure 2.1c none of these things is true. Here the nodes stand for mutually exclusive states of the network as a whole; the arrows represent possible transitions between these states.

Diagrams 2.1a and 2.1c both describe the same physical system, but in diagram 2.1c the descriptive elements stand for distributed states; there is no simple one-to-one correspondence between these elements and particular physical parts of the network.
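The contrast between the two levels of description can be made concrete with a toy program. What follows is only a sketch under stated assumptions: the ring wiring, the starting state and the names `SOURCE_OF`, `step`, `run` and `P0` to `P4` are invented for illustration and are not taken from Hinton's figure. The same five-element system is described first element by element, and then as a set of named, mutually exclusive whole-network states with transitions between them.

```python
# Illustrative wiring only (an assumption, not Hinton's actual network):
# five binary units in a ring, each driven by the unit before it.
N_UNITS = 5
SOURCE_OF = {i: (i - 1) % N_UNITS for i in range(N_UNITS)}


def step(state):
    """Hardware-level update: each unit takes on its source unit's activity."""
    return tuple(state[SOURCE_OF[i]] for i in range(N_UNITS))


def run(state, n_steps):
    """Return the list of successive network states, starting from `state`."""
    history = [state]
    for _ in range(n_steps):
        state = step(state)
        history.append(state)
    return history


# Level 1: activity of the individual elements over time (cf. Figure 2.1b).
history = run((1, 1, 0, 0, 0), 5)

# Level 2: give each recurring whole-network pattern a name and record
# which pattern follows which (cf. Figure 2.1c).
names = {}
for pattern in history:
    names.setdefault(pattern, "P%d" % len(names))
transitions = {names[a]: names[b] for a, b in zip(history, history[1:])}

for pattern in history:
    print("".join(str(bit) for bit in pattern), names[pattern])
print(transitions)
```

At the first level several elements are 'on' at once; at the second level exactly one named pattern holds at any moment, and no name corresponds to any single element.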

[Fig. 2.1, panel (b): activity states of five hardware elements over successive time steps. Figure not reproduced.]

This illustration of Hinton's makes a good starting-point for understanding the idea of 'distributed' memory. In that example, however, a whole lot of important questions were (temporarily) sidestepped. To begin with, one might ask, in what sense should any particular pattern of activity in the network be treated as a 'unit', rather than the merely accidental co-occurrence of activities among its constituent hardware elements? In Figure 2.1c, the arrows assert something about the sequential constraints among particular activity states, i.e. they refer to the (past or future) history of the network. 'Units', in this notation, are thus activity-patterns that stably recur in the system's history. If the network is to act as a memory, we want it to distinguish patterns or events that are already familiar (i.e. that stably recur), and thus can have potential significance, from unknown or arbitrary configurations. Our initial question thus gives rise to two, more specific questions:

1. How might such a network be arranged so that each of a number of different activity-patterns can be stably reinstated at different times?

2. How might new, reinstatable activity-patterns (new 'units') be learnt?

To begin to answer these questions, imagine now a network of hardware elements, in which every element is connected to every other, including itself, as in Figure 2.2a. Assume also that each element can be active in a graded amount, rather than simply 'on' or 'off'. Each interconnection transmits excitation (inhibition) from one element to another, with a given positive (negative) weighting, or 'strength' of transmission. The same weightings can be shown also in the form of a matrix of interconnections, as in Figure 2.2b. (Naturally, for any psychologically plausible application to human memory we shall need to think about a matrix of many more than just four elements.)
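As a minimal sketch of what such a matrix of weightings does, the fragment below passes graded activity once through a four-element network. The numerical weights, the activity levels and the names `WEIGHTS` and `net_input` are assumptions chosen for illustration; only the layout (rows as source units, columns as receiving units) follows Figure 2.2b.

```python
UNITS = ["A", "B", "C", "D"]

# WEIGHTS[source][receiver]: row = source unit, column = receiving unit,
# following the layout of Figure 2.2b. Positive entries are excitatory,
# negative entries inhibitory. The values are made up for illustration.
WEIGHTS = {
    "A": {"A": 0.25, "B": 0.5, "C": -0.5, "D": 0.25},
    "B": {"A": 0.5, "B": 0.25, "C": 0.25, "D": -0.5},
    "C": {"A": -0.5, "B": 0.25, "C": 0.25, "D": 0.5},
    "D": {"A": 0.25, "B": -0.5, "C": 0.5, "D": 0.25},
}


def net_input(activity):
    """Total weighted input arriving at each receiving unit."""
    return {
        receiver: sum(WEIGHTS[source][receiver] * activity[source]
                      for source in UNITS)
        for receiver in UNITS
    }


# Graded (not merely on/off) activity levels in the four elements.
activity = {"A": 1.0, "B": 0.5, "C": 0.0, "D": 0.0}
received = net_input(activity)
print(received)
```

With these made-up values, unit C receives net inhibition while units A and B receive net excitation, and the excitatory and inhibitory inputs to D cancel.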

                    Receiving units
                    A    B    C    D
Source units   A    aa   ab   ac   ad
               B    ba   bb   bc   bd
               C    ca   cb   cc   cd
               D    da   db   dc   dd

Fig. 2.2 (a) A completely interconnected network of physical elements. (b) The same system shown as a matrix of interconnections. Each interconnection may have a different variable weighting.

Most of the suggestions about learning within such a matrix of interconnected active elements are variants of an idea put forward originally by Hebb (1949). The idea is that the strength of connectivity between any two elements (neurons) changes as a function of the amount of concurrent ('pre- and post-synaptic') activity in that pair of elements. For example, in Anderson's (1977) matrix memory model, the basic learning assumption is that the weightings of each interconnection are changed in proportion to the product of the activity level in each of the corresponding pairs of source and receiving units. If the inputs to such a system cause the same pattern of activity to occur repeatedly, the set of active elements constituting that pattern will become increasingly strongly inter-associated. That is, each element will tend to turn on every other element in the inter-associated pattern and (with negative weights) to turn off the elements that do not form a part of the pattern. To put it another way, the pattern as a whole will become 'auto-associated': it will come to cause itself as its own successor. It thus becomes one of a set of stable states of the system. We may call a learned (auto-associated) pattern an 'engram'.
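The product rule just described can be sketched in a few lines. This is an illustration under assumptions of my own, not Anderson's published model: activity is coded as +1 ('on') or -1 ('off'), all units are updated together, and the names `hebbian_update` and `successor` are hypothetical. After the same pattern has recurred a few times, it reproduces itself as its own successor.

```python
def hebbian_update(weights, pattern, rate=1.0):
    """Add rate * (source activity * receiver activity) to every connection."""
    n = len(pattern)
    for i in range(n):          # source unit
        for j in range(n):      # receiving unit
            weights[i][j] += rate * pattern[i] * pattern[j]


def successor(weights, state):
    """Next state: each unit goes on (+1) or off (-1) by the sign of its input."""
    n = len(state)
    nets = [sum(weights[i][j] * state[i] for i in range(n)) for j in range(n)]
    return [1 if net >= 0 else -1 for net in nets]


# Assumed coding: +1 = 'on', -1 = 'off'. Both patterns are invented examples.
learned = [1, -1, 1, 1, -1, -1]
unlearned = [1, 1, 1, -1, -1, -1]

weights = [[0.0] * 6 for _ in range(6)]
for _ in range(3):              # the same input pattern recurs three times
    hebbian_update(weights, learned)

print(successor(weights, learned))    # the learned pattern maintains itself
print(successor(weights, unlearned))  # the unlearned pattern does not
```

In this toy case the unlearned pattern is not merely unstable; because it partly overlaps the learned one, it is replaced by it.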

The establishment of an auto-associated pattern will have a number of interesting consequences.

1. Stability

Once evoked, a learned pattern (but not an unlearned one) will tend to maintain itself.

2. Part-to-whole retrieval

The activation of only some elements of the learned pattern will tend to evoke each of the remaining elements of that pattern, since all of its missing elements receive positive connections from each of the elements already present, while currently active elements that are not part of the learned pattern are inhibited. As more of the missing elements are activated, they also begin to assist the recruitment of the remainder of the auto-associated pattern, until the network settles into the completed pattern. Some dramatic illustrations of this auto-associative forcing of missing pattern-parts are given by Kohonen (1977; Kohonen et al., 1981).
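Part-to-whole retrieval can be sketched as follows, under assumptions that are mine rather than Kohonen's or Anderson's: units are coded +1, -1 or 0 (0 meaning 'not yet specified'), two invented patterns are stored in one matrix by the product rule, and units are updated one at a time so that each newly recruited element helps to recruit the rest. The names `store` and `complete` are hypothetical.

```python
def store(patterns):
    """Superimpose the product-rule weights for each pattern in one matrix."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                weights[i][j] += p[i] * p[j]
    return weights


def complete(weights, cue, max_sweeps=5):
    """Update units one at a time until a sweep produces no further change."""
    state = list(cue)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for j in range(n):
            net = sum(weights[i][j] * state[i] for i in range(n))
            new = 1 if net > 0 else -1 if net < 0 else state[j]
            if new != state[j]:
                state[j] = new      # a recruited unit now helps recruit others
                changed = True
        if not changed:
            break
    return state


# Two invented, uncorrelated patterns (+1 = on, -1 = off).
first = [1, 1, 1, 1, -1, -1, -1, -1]
second = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([first, second])

# Partial cues: 0 marks an element about which the input says nothing.
print(complete(weights, [1, 1, 1, 0, 0, 0, 0, 0]))    # fragment of `first`
print(complete(weights, [0, 0, 0, 0, 1, -1, 1, 0]))   # fragment of `second`
```

Each three-element fragment is enough, in this small example, for the network to settle into the whole of the pattern it came from.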

3. Retrieval dynamics

The process of reinstatement of the complete learned pattern is thus extended over time. Where the input is related, in some degree, to several different engrams (see below), the network will take longer to 'settle' into one, stable pattern of activity. Ratcliff (1978) has put forward a mathematical model of memory retrieval dynamics that is formally equivalent, in several important respects, to that of Anderson (1977), and which provides an impressive fit to a range of experimental data on memory retrieval times.


4. Categorical perception and 'capture'

Input patterns that are similar to (i.e. that share many elements with) a strongly auto-associated pattern, but which are not themselves already-learned patterns, or are less strongly learned, will tend to recruit the more strongly learned pattern and thus be replaced by it. That is, they get 'captured' by the stronger pattern. With some quite reasonable assumptions about feedback within such a system, and a maximum and minimum (zero) activity level in individual elements, it can be shown that such systems will tend to settle into stable, learned activity-patterns in which some units are maximally active while the remainder are not responding at all (Anderson, 1977). In effect, that is, such systems will tend to exhibit a strong form of 'categorical perception'.
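The following sketch illustrates 'capture' with bounded, graded activity. It is written in the spirit of the feedback-with-limits idea described above but is not a reproduction of Anderson's equations: the update rule (add a fraction of the fed-back input, then clip), the limits of -1 and +1, the rate of 0.5 and the names `store`, `clip` and `settle` are all assumptions made for illustration.

```python
def store(pattern):
    """Product-rule weights for a single strongly learned pattern."""
    n = len(pattern)
    return [[pattern[i] * pattern[j] for j in range(n)] for i in range(n)]


def clip(x, low=-1.0, high=1.0):
    """Keep an activity level between its assumed minimum and maximum."""
    return max(low, min(high, x))


def settle(weights, state, rate=0.5, n_steps=20):
    """Repeatedly add a fraction of the fed-back input, then clip."""
    n = len(state)
    state = list(state)
    for _ in range(n_steps):
        nets = [sum(weights[i][j] * state[i] for i in range(n)) for j in range(n)]
        state = [clip(state[j] + rate * nets[j] / n) for j in range(n)]
    return state


strong = [1, 1, 1, 1, -1, -1, -1, -1]

# A weak, graded input that resembles `strong` but disagrees with it in
# sign on two of the eight elements (positions 3 and 6).
similar_input = [0.5, 0.25, 0.5, -0.25, -0.5, -0.5, 0.25, -0.25]

final = settle(store(strong), similar_input)
print(final)
```

The input's two deviant elements are overwritten, and every unit ends at one or other limit of its range, so that a graded, imperfect input yields an all-or-none, 'categorical' outcome.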

5. Many engrams

Suppose, now, that the input forces a different activity-pattern in the same population of interconnected elements. If this pattern recurs, or is sustained, it too will come to be auto-associated. However, the (at first sight) really surprising feature of matrix memories of this kind is that the learning of this new pattern need not disturb the memory for (i.e. the recoverability of) the previously learned pattern, even though both patterns are stored in the same matrix of interconnections. So long as the different patterns are orthogonal (that is, so long as they are not correlated with one another), then many different patterns (engrams) can be literally superimposed on the same matrix of interconnected elements, without mutual interference.⁴ The requirement for interference-free recovery of stored patterns, that the different patterns should be uncorrelated, is intuitively obvious when it is appreciated that the process of retrieval of any stored pattern is essentially a process of correlating a given input-vector (a 'retrieval cue') against the matrix as a whole.⁵ To the extent that the retrieval pattern correlates with (overlaps with, resembles) more than one engram that has been stored in the matrix, retrieval will inevitably be distorted by, or suffer 'interference' from, these other, related patterns.
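A small numerical sketch may help to show why uncorrelated patterns can share one matrix. The patterns, the +1/-1 coding, the scaling by the number of units and the names `store`, `linear_recall` and `correlation` are assumptions for illustration. Retrieval here is simply the cue multiplied through the matrix, with no thresholding, so that any interference shows up directly in the retrieved values.

```python
def store(patterns):
    """Superimpose the product-rule weights for each pattern in one matrix."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                weights[i][j] += p[i] * p[j]
    return weights


def linear_recall(weights, cue):
    """Correlate the cue against the matrix, scaled by the number of units."""
    n = len(cue)
    return [sum(weights[i][j] * cue[i] for i in range(n)) / n for j in range(n)]


def correlation(a, b):
    """Normalised overlap between two +1/-1 patterns (0.0 = orthogonal)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]      # uncorrelated with p1
q = [1, 1, 1, 1, 1, 1, -1, -1]         # correlated with p1

print(correlation(p1, p2), correlation(p1, q))

clean = linear_recall(store([p1, p2]), p1)     # orthogonal engrams
distorted = linear_recall(store([p1, q]), p1)  # overlapping engrams
print(clean)
print(distorted)
```

With the orthogonal pair the cue is returned unchanged; with the correlated pair the output is the cue plus half of the other engram, which is the 'interference' described above.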

The same principles apply to associations between activity-patterns in different sets of hardware elements. Imagine that the group of elements, α, in Figure 2.3, is completely connected to a second group of elements, β: every element in the first group is connected to every element in the second group. Suppose further,







Fig. 2.3 Two groups of physical elements, α and β, representing two different domains of attributes (after Anderson, 1977). Every element in α projects to every element in β. Every element in β receives an input from every element in α.

that whenever the activity-pattern A is excited by inputs to α, other inputs ('forcing' inputs) excite the activity-pattern B in the second set of elements, β. (For a discussion of the role of 'forcing stimuli' in associative learning see Kohonen et al, 1981.) As before, our assumption is that the strength of each interconnection is changed as a function of the product of the activity in each interconnected pair of elements in α and β, respectively. Now, after this learning has occurred, whenever pattern A recurs in the elements of α, pattern B will be reinstated in the elements of β. Again, many different associations between different activity-patterns in α and β can be stored within the same matrix of interconnections; and again, of course, the same limitations will be observed due to interference from similar, or related, patterns and their stored associations. The effective 'strength' or recoverability of an engram will be a joint function of (1) the strength of auto-association among the elements of the stored pattern, and (2) the strength of association between the retrieval cue and the to-be-recovered engram, relative to its overlapping associations with all other stored engrams, in other words its distinctiveness or 'uniqueness' as a retrieval cue (cf. Cermak & Craik, 1979).
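The association between patterns in two sets of elements can be sketched in the same way. The learning rule below changes each connection by the product of the activities it links, as the text assumes; the particular patterns, the learning rate of 1/4 and the function names are illustrative assumptions.

```python
# Sketch of hetero-association between two groups of elements (alpha -> beta).
# Patterns, learning rate and names are illustrative assumptions.

def learn(weights, a, b, rate):
    """Change each connection by rate * (activity in beta) * (activity in alpha)."""
    return [[w + rate * bi * aj for w, aj in zip(row, a)]
            for row, bi in zip(weights, b)]

def respond(weights, a):
    """Activity evoked in beta by the pattern a in alpha."""
    return [sum(w * aj for w, aj in zip(row, a)) for row in weights]

# Two orthogonal patterns in alpha (four elements), each paired by 'forcing'
# inputs with a pattern in beta (three elements).
A1, B1 = [1, -1, 1, -1], [1, 1, -1]
A2, B2 = [1, 1, -1, -1], [-1, 1, 1]

weights = [[0.0] * len(A1) for _ in B1]
weights = learn(weights, A1, B1, rate=1 / len(A1))
weights = learn(weights, A2, B2, rate=1 / len(A2))

print(respond(weights, A1))  # reinstates B1
print(respond(weights, A2))  # reinstates B2
```

Both associations share one matrix of interconnections, yet each alpha pattern reinstates its own beta pattern because the two cues are uncorrelated.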

As Anderson and others have frequently pointed out, memory in this kind of system is, formally as well as intuitively, a form of tunable filter, responding only to learned ('tuned') input patterns. The response of the system to a novel input-pattern, i.e. one completely unrelated (orthogonal) to any previously stored pattern, will be damped to zero. Similarly, an orthogonal activity-pattern in the elements of α that has not been associated with an activity-pattern in another set of elements, β, will give rise to zero activity in β. That is, a novel input will not be 'seen' by higher levels of the system until it is learned. This must have the result, as Anderson points out, that such a system will be agonizingly difficult to teach. Once some learned patterns are established, however, further associative learning can be increasingly rapid, the larger (or more multi-dimensional) are the already learned patterns involved. For the same reasons, initial biases in the network will have a profound influence on later learning (cf. Edelman, 1981).
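The 'tunable filter' property can be shown with a single learned association. This is a sketch under stated assumptions: eight alpha elements with 0/1 activity, two beta elements, and a normalised product rule; none of these details come from the chapter.

```python
# Sketch of the 'tunable filter' property of a matrix memory.
# Element counts, 0/1 activities and names are illustrative assumptions.

def associate(a, b):
    """Connection strengths from the product of activities, scaled to unit gain."""
    gain = sum(x * x for x in a)
    return [[bi * aj / gain for aj in a] for bi in b]

def respond(weights, a):
    """Activity evoked in beta by the pattern a in alpha."""
    return [sum(w * x for w, x in zip(row, a)) for row in weights]

learned_alpha = [1, 1, 1, 1, 0, 0, 0, 0]
learned_beta = [1, -1]
weights = associate(learned_alpha, learned_beta)

novel_alpha = [0, 0, 0, 0, 1, 1, 1, 1]    # orthogonal to the learned pattern
similar_alpha = [1, 1, 1, 0, 0, 0, 0, 1]  # shares three of four active elements

print(respond(weights, learned_alpha))  # full response
print(respond(weights, novel_alpha))    # no response: the input is not 'seen'
print(respond(weights, similar_alpha))  # partial response, in proportion to overlap
```

The response to the similar pattern is scaled by its overlap with the learned one, so the same mechanism that filters out novel inputs also yields graded generalization to related inputs.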

Some properties of distributed, matrix memories

The foregoing is intended to give an entirely informal and intuitive introduction to the basic ideas of distributed representation and matrix memory systems. The theory of distributed representation has been developed over the past dozen years or so by a number of people, notably, with application to the psychology of memory, by James Anderson and his colleagues at Brown University (Anderson, 1973, 1977), and by Kohonen in Helsinki (Kohonen, 1977; Kohonen et al, 1981). Hinton & Anderson (1981) have compiled an outstanding collection of papers on distributed memory models by many different authors including themselves, and the interested reader is very strongly advised to consult this collection for a fuller and, of course, more technical introduction. It should be emphasized that the presentation here has been kept to some of the most basic, qualitative features of parallel, distributed representation. Almost nothing has been said about the computational capabilities of such parallel, network systems, which have in fact begun to be used extensively in modelling the complex processes of human vision (Ballard et al, 1983). Their application to theories of higher mental function is also being actively explored (Fahlman et al, 1983).

Even with the very informal account to which we have confined ourselves here, however, a number of the important properties of such distributed memory systems should be apparent.

First, 'retrieval' is not a matter of fetching information from some storage location and transferring or copying it into another location where it can be 'read', as it is in almost all other kinds of conventional memory-systems, from current general-purpose computers to libraries. Retrieval, in a distributed, matrix memory consists in the re-activation of a specific activity pattern in a specific (i.e. code-specific or content-specific) subset of elements. The activation of that pattern, in that set of elements, can give rise in turn to the activation of an associated pattern, in a different set of elements, and so on. The essential character of the information-processing that occurs in such a system thus consists in 'mapping' or transcoding patterns of activity from one set of elements to another. Radically unlike other kinds of memory, however, there is no distinction between the 'processor' that operates on the available information and the 'store' in which it is held. The memory is not a passive container, in which, in principle, any information-content can be placed, but an active, content-specific pattern-recognizer and pattern-transcoder.

Second, among any set of engrams or learned patterns that have been superimposed on the same population of hardware elements, only one can be fully retrieved (re-activated) at a time. That is to say, within any one set (or 'domain') of pattern-feature elements, a distributed memory system must be 'single-channel' in operation. In order that one pattern can be fully realized, other, potentially competing patterns on the same set of elements must be suppressed.

Third, learning is automatically generalized to new input patterns in proportion to their resemblance (correlation) to patterns already learned, a property of the very greatest importance in dealing with a world in which events seldom, if ever, recur exactly as before; a property, also, that appears to be omnipresent in biological memories, and whose absence is perhaps the single most severe limitation on the use or recovery of information from large-scale conventional (man-made) memory systems. (The latter, in default of true similarity-based content-addressing abilities, must fall back on elaborate methods of indexing and searching to locate the desired information, such as the pattern-matching and back-tracking procedures of list processing languages, that have formed the indispensable apparatus of contemporary Artificial Intelligence.) In distributed, matrix memories the 'interference' resulting from similarity among stored patterns is the price that is paid for the enormous advantage of automatic transfer of learning to similar but novel configurations.


Fourth, matrix memory systems automatically respond to the common elements, or prototypes, from a set of related, learned instances, where the 'prototype' is the pattern having the highest correlation with (sharing the largest number of microfeatures with) the entire set of instances, even though the prototype pattern itself was never previously encountered, a property that is evidently possessed by biological, human memory (e.g. Posner & Keele, 1970; Solso & McCarthy, 1981). To put the same point in a slightly different way, matrix memory systems extract 'semantic' memory, in the sense of the long-term-invariant or common features and relationships of many encoded events and their associations, as an automatic by-product of the encoding of particular, related, 'episodic' instances. However, there is no explicit encoding of these common features and relations distinct from the encoding of each particular instance. 'Episodic' and 'semantic' memory (Tulving, 1983) are thus not separate 'components' of mind. It should not be possible to lose 'semantic' memory while preserving episodic memory for the same classes of encoded events; though the converse, one-way dissociation (failure to retrieve unique, episodic 'context' information) may occur (Kinsbourne & Wood, 1982).
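The prototype effect can be sketched by storing only distorted instances of a pattern and then probing the matrix with the never-stored prototype. The six-element patterns, the one-element distortions and the 'match' measure (the cue's dot product with the matrix's own response to it) are illustrative assumptions, not a model taken from the studies cited.

```python
# Sketch of prototype extraction in an auto-associative matrix memory.
# Pattern size, distortion scheme and the match measure are assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def flip(pattern, k):
    """An 'instance': the prototype with element k inverted."""
    return [-v if i == k else v for i, v in enumerate(pattern)]

def store(patterns):
    """Superimpose the outer product of every pattern on one matrix."""
    size = len(patterns[0])
    memory = [[0] * size for _ in range(size)]
    for p in patterns:
        for i in range(size):
            for j in range(size):
                memory[i][j] += p[i] * p[j]
    return memory

def match(memory, cue):
    """Strength of the matrix's response in the direction of the cue itself."""
    response = [dot(row, cue) for row in memory]
    return dot(cue, response)

prototype = [1, -1, 1, 1, -1, 1]
instances = [flip(prototype, k) for k in range(len(prototype))]
memory = store(instances)  # the prototype itself is never stored

match_prototype = match(memory, prototype)
match_instances = [match(memory, x) for x in instances]
print(match_prototype, match_instances)
```

In this toy case the unseen prototype evokes a stronger response than any instance that was actually stored, without any separate encoding of the common features.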


Word forms and word-meanings

Can we now identify, in terms of these ideas about distributed, matrix memory systems, what would count as the separable 'functional components' of neuropsychology (Section I)? We know from a very wide range of neurophysiological research (Cowey, 1981; Mountcastle, 1978) that individual neural elements in different local regions of brain are responsive to different classes of sensory attributes: in vision, to colour, movement, orientation, stereo disparity, and so on; in hearing, to pitch, glide, duration, etc., etc. Moreover, in some regions individual units are found to be selectively responsive to highly complex configurations, such as faces (Perrett et al, 1982). Let us call the class of attributes encoded by each of these sets of specialist elements an attribute domain. It appears very natural, then, to propose that the neuropsychologists' separable 'functional components', identified behaviourally through doubly-dissociable deficits in performance, correspond to sets of auto-associated patterns, or engrams, defined over a common population of feature-elements, hence over the same attribute domain. Different attribute domains, different 'components'.

Consider the store of spoken word-forms as one such hypothetical component (Allport & Funnell, 1981). It seems reasonable to assume a representational domain in which the individual elements are responsive to acoustic spectra and to the temporal modulation of sound patterns (Kay, 1982). Possibly, we may need to envisage also a more abstract attribute domain, in which the elements encode phonetic or even phonemic properties, though the evidence does not seem particularly to favour it (Klatt, 1979). A neural dictionary of spoken word-forms might then be realized as a set of auto­associated patterns superimposed on the same acoustic (or phonetic) attribute domain, hence on the same population of feature elements, the same neuronal network. In a system of this kind, individual word-units, therefore, could not be identified with particular sets of neural elements; on the contrary, the entire vocabulary of spoken word-forms would be physically superimposed on the same neural network.

An immediate consequence of this kind of distributed representation is that physical injury should not result in the loss of particular word-forms while others remained unimpaired. Rather we should expect the destruction of any proportion of the neural network underlying the vocabulary of spoken word-forms to have the effect of reducing the discriminability of many or all these learned patterns; the larger the lesion, the greater the effect. Wood (1978, 1981) has constructed a simulation model, based on Anderson's matrix memory ideas, which exhibits precisely these properties of local 'mass-action': decreasing overall retrieval accuracy as increasing numbers of elements in the matrix are disabled. Further, the failure or inaccuracy of retrieval should be most apparent in respect of those word-forms (or other engrams) that are least strongly auto-associated; thus, uncommon words will be more impaired than those that have been encountered often (or also, perhaps, more recently). Moreover, the errors in retrieval will take the form of increased confusability among acoustically (or phonetically) similar word-forms, including the 'capture' of less familiar words by their stronger neighbours within the attribute-space. Finally, since the word-units in such a system exist only as sequential compositions of (acoustic/phonetic) feature-elements, it must follow that degradation of information at the word level should always be accompanied by loss of discriminability at the sub-lexical feature level.

My contention is that all of these properties are precisely to be found in dysphasic, lexical impairment.

- Slower, less distinctive (errorful) retrieval or recognition of spoken word-forms.

- In word-finding, incomplete and/or misordered retrieval in the form of phonemic paraphasias.

- Capture of less familiar word-forms by their acoustic neighbours, both in production, as in so-called 'verbal paraphasias' (malapropisms), and in recognition (Allport, 1983b, c).

- Impairment of spoken word-forms in perception and production appears to be associated with impaired discrimination of speech sounds at a sub-lexical level (e.g. Allport, 1983c). The discovery of even a single case in which the word-form store was clearly impaired, without any corresponding sub-lexical impairment, would threaten one of the central assumptions put forward here, about lexical (word-form) representation.

- Finally, what (I maintain) is not observed is the permanent loss of particular spoken word-forms, leaving their acoustic neighbours available and unimpaired. Again, the unambiguous demonstration of even one such case would be sufficient to falsify the model. (Note: Wood (1981) has shown how the retrieval of particular engrams may be selectively impaired, even in a fully distributed memory; this can occur if two, nearly identical engrams differ from one another only by a few microfeatures, all of which have been lesioned, and if no other engrams are critically dependent for their differentiation on the same microfeatures. Clearly this type of effect in no way alters the statement, above, nor its openness to empirical falsification.)
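The predicted 'capture' of a weakly learned word-form by a stronger neighbour after injury can be sketched as follows. This is not Wood's (1978, 1981) simulation: the twelve-element patterns, the strengths 3 and 1 (standing in for frequent and infrequent words), the single-step sign rule for retrieval and the choice of lesioned elements are all illustrative assumptions.

```python
# Toy sketch of lesioning a matrix memory that holds two similar word-forms.
# Pattern size, strengths, retrieval rule and lesion sites are assumptions.

def sign(value):
    return (value > 0) - (value < 0)

def store(weighted_patterns, size):
    """Superimpose patterns, each weighted by its strength of auto-association."""
    memory = [[0] * size for _ in range(size)]
    for pattern, strength in weighted_patterns:
        for i in range(size):
            for j in range(size):
                memory[i][j] += strength * pattern[i] * pattern[j]
    return memory

def recall(memory, cue, dead=frozenset()):
    """One retrieval step; disabled ('dead') elements neither send nor receive."""
    alive = [i for i in range(len(cue)) if i not in dead]
    out = [0] * len(cue)
    for i in alive:
        out[i] = sign(sum(memory[i][j] * cue[j] for j in alive))
    return out

def mask(pattern, dead):
    """The pattern as it appears over the surviving elements only."""
    return [0 if i in dead else v for i, v in enumerate(pattern)]

strong = [1, -1] * 6                            # a frequently encountered word-form
weak = strong[:7] + [-v for v in strong[7:]]    # a rarer neighbour, 5 features differ
memory = store([(strong, 3), (weak, 1)], size=12)

intact = recall(memory, weak)
small_lesion = recall(memory, weak, dead={11})
larger_lesion = recall(memory, weak, dead={10, 11})
print(intact == weak, larger_lesion == mask(strong, {10, 11}))
```

With the matrix intact, or with one element disabled, the weak word-form is still retrieved; with two of its distinguishing elements disabled, the same cue retrieves the strong neighbour instead, while retrieval of the strong word-form is unaffected.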

Written word-forms

The immediately preceding discussion has been in terms of a store of spoken word-forms, as one possible example of a neuropsychologically dissociable, functional component. A similar case can be made in respect of a store of written word-forms. Allport & Funnell (1981) reviewed a variety of evidence for the independence of these two functional components, which they referred to, respectively, as the phonological and the orthographic lexicon. Each one of the empirical consequences, listed above, of injury to the phonological lexicon, based on our assumptions regarding distributed representation, can be re-stated, mutatis mutandis, as consequences of injury to the orthographic lexicon. Here, of course, the increased confusability in retrieval will be in terms of orthographic (letter by letter) similarity. Again, according to the model of distributed representation advocated here, there are no orthographic word-units physically distinct from the representations of (positional) letter-identities of which they are composed. The model, therefore, predicts that impairment of the (receptive-expressive) orthographic lexicon should be invariably accompanied by increased confusability among (non-lexical) letter identities, and/or, perhaps, letter-positions.

Most importantly, if this model is correct, what should never be observed is the selective, permanent loss of orthographic knowledge regarding any individual written word, while its orthographic neighbours-sharing many of the same letters, in the same (approxi­mate) relative positions-are unimpaired.

Word-meanings and object concepts

I assume that the distributed engrams representing particular word-forms in the phonological and orthographic lexicons are associatively linked with other auto-associated patterns representing non-linguistic word-meanings ('semantic memory'). For simplicity, let us confine the discussion here to the representation of relatively simple object-concepts. Following the general conception of distributed memory that I have put forward here, I shall further assume that the auto-associated patterns representing physical objects are distributed across a very wide range of attribute domains, encompassing every class of sensory and motor (action-related) attributes pertaining to the particular object-concept. The object-concept of telephone, for example, must involve the convolution not only of many different complex properties of shape, surface texture, size and so forth that are codable in visual and tactile attribute domains, but also properties specific to auditory and to action-coding domains of representation, including manipulation and speech. Indeed, the full object-concept for telephone, as for any other functional artefact, must presumably embody a specification of the complete 'scripted' routine of interactions with the object. The engrams specifying complex action-routines (when and what to pick up, how to hold it, etc., etc.) will no doubt share many of the (auto-associated) sub-patterns, of which they are composed, with a vast number of other learned action-routines that likewise involve grasping, picking up, etc. Similarly in the sensory domains, the same elements involved in coding, say, the characteristic sounds of one particular object (a telephone) will participate also in many other auto-associated patterns representing other objects or events. Figure 2.4 gives a very rough sketch of the idea here, though the diagram fails to capture the hierarchical nature of object-concepts.

Fig. 2.4 Schematic diagram to illustrate how object concepts might be represented as auto-associated activity patterns (dotted outlines) distributed across many different sensory and motor attribute domains. Spoken and written word-forms are similarly represented as auto-associated patterns within their corresponding ('phonological'/'orthographic') attribute domains. Mappings between word-forms and word-meanings are embodied as distributed matrices of interconnections between attribute domains.

The essential idea is that the same neural elements that are involved in coding the sensory attributes of a (possibly unknown) object presented to eye or hand or ear also make up the elements of the auto-associated activity-patterns that represent familiar object-concepts in 'semantic memory'. This model is, of course, in radical opposition to the view, apparently held by many psychol­ogists, that 'semantic memory' is represented in some abstract, modality-independent, 'conceptual' domain remote from the mech­anisms of perception and of motor organization.



Again, if we consider the possible effects of physical injury to such a system, some consequences are immediately apparent:

- Since object-concepts are typically distributed over many different attribute domains and hence, generally, over widely dispersed brain regions, they will appear to be less vulnerable to local brain injury than (for example) word-forms in the phonological lexicon that are defined over only one (or very few) attribute domains. Only very diffuse or widespread injury, as in severe degenerative disease or toxicosis, is liable to result in the clinically evident loss of entire classes of object-concept (e.g. Warrington, 1975).

- Concepts defined over relatively few attribute domains-purely visual objects, for example, such as clouds or colours-will be more vulnerable to local cerebral injury (Gardner, 1973).

- Disorders of object-concepts could result either from degrada­tion of the auto-associative linkages that bond components of the engram together as a unit, or from more local injury to particu­lar attribute domains. In the latter case, the loss of particular attribute information in semantic memory should be accom­panied by a corresponding perceptual (agnosic) deficit.

- It is important to distinguish loss or degradation within the object-representations from the functional disconnection of object-concepts and their corresponding spoken or written word-forms. The associative links between word-forms and object-concepts, embodied as a distributed matrix of interconnections, will possess the same properties of graceful degradation (or 'mass action') as the engrams that they link. Effects of this kind, resulting from partial disconnections between attribute domains (and relating to anomic and other dysphasic syndromes), have been demonstrated by Gordon (1981) in a simulation study of distributed representation. Since the associative mappings in each direction are distributed over quite different populations of links ('synapses'), the functional disconnections can of course be unidirectional (cf. Allport & Funnell, 1981).

The theory of distributed, associative memory has many fundamental implications for our understanding of neuropsychological impairments, only a few of which have been even touched on here. In particular, I have tried to suggest how these ideas may be mapped onto neuropsychological conceptions of separable 'functional components'. I conclude, now, with an illustration of how the same theoretical orientation may help to clarify what perhaps should not be considered as a functionally independent component. Other examples could be given, but one must suffice.

Auditory-verbal short-term memory

I began by claiming that the dysphasias were, self-evidently, a kind of memory disorder. (I hope that, by now, the sense in which this claim was intended is sufficiently clear.) However, in the more familiar sense of 'memory' as recall or recognition of particular events, people, places (that is, episodic memory) there is little to suggest that dysphasic patients have any necessarily accompanying impairments of this kind. With one exception. This is in the immediate, auditory-vocal repetition of spoken sequences: what used to be called 'immediate memory span' (Miller, 1956). It is a commonplace that virtually all dysphasic patients (certainly all those whom it would be appropriate to classify as having impairments of the phonological lexicon) have a greatly reduced span of immediate repetition (e.g. Albert, 1976; Heilman et al, 1976). The majority of healthy adults can repeat back a sequence such as a seven-digit telephone number, without error. For many dysphasic patients the span is of three digits, or less.

Up to the early 1970s at least, it was thought appropriate to attribute the (normal) immediate repetition span, or a large part of it, to the capacity of a central short-term store (STS) that was either (opinions differed) modality independent or in some way specialized for spoken language (Atkinson & Shiffrin, 1971; Crowder, 1976). Thus it was very natural for Shallice and Warrington (1970, 1977; Warrington & Shallice, 1969) to represent STS as a separable 'func­tional component', in the neuropsychological sense, and to suggest that deficits in 'span' could be due to specific impairment of this hypothetical, functional component, which they referred to as 'auditory-verbal short-term memory'. However, there are several reasons why their hypothesis, formulated in this way, may be a mistake.

First, the once-popular distinction between long-term and short­term stores (even attribute-specific stores) as functionally separable components has come under increasingly severe criticism. It is probably fair to say that there is now no really convincing evidence in favour of such a distinction, and much that is contrary (e.g. Hunt & Elliott, 1980; Glenberg & Kraus, 1981; for extensive discussion of this issue see Cermak & Craik, 1979).

Second, as is well known, serial 'span' in normal subjects is massively affected by the acoustic similarity among the words in the sequence. 'Bat, hat, rat, gnat, vat, cat' is harder to remember than 'bat, hood, shrew, midge, jar, dog'. Moreover, in normal as well as in dysphasic subjects, span depends not only on the sounds of the syllables but on their lexical familiarity and their meaningfulness. Thus, the typical span of seven digits drops to around five common words, and to only about two or three nonsense-syllables; sequences of common words result in a longer span than rare words; names of concrete objects have a longer span than abstract words (Brener, 1940).

All of these behavioural characteristics suggest that a mechanism having the properties of the phonological lexicon is intimately involved in-or, indeed, is the functional component responsible for-repetition span. If this suggestion is correct, it would, of course, follow that all those dysphasic patients who show impair­ment of the phonological lexicon should also exhibit a reduced auditory-verbal repetition span. The available evidence (current­ly quite limited) suggests that they do (Allport, 1983b). If, on the contrary, auditory-verbal short-term memory and the phonological lexicon were functionally independent subsystems, there is no obvious or compelling reason why these impairments should in fact co-occur.

In the model I have put forward here, lesion of the phonological lexicon must result in the reduced distinctiveness of the 'phonological' attribute domain. Consequently, for all such patients, spoken word-lists are more acoustically (phonologically) similar. For the same reason, as the dimensionality of the attribute-space is reduced, so will be the strength of all those auto-associated patterns (word-forms) that are defined over it. The effect will be that, for such patients, previously familiar words behave more like uncommon words or even like nonsense-syllables; their stability and recoverability is diminished. In the simplest kind of matrix model of list memory (Murdock, 1979), the signal-to-noise ratio (d') for item information is equal to k/n, where k is the number of dimensions or 'feature elements' in the attribute domain and n is the number of items in the list. For a given d', the smaller (more severely lesioned) the attribute-space, the fewer the list-items that can be recalled. (Order information will show similar effects. The coding of temporal sequence in distributed memory is briefly discussed by Murdock, 1979, and by Kohonen et al, 1981.)
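The arithmetic of this relation can be made concrete. The sketch below simply applies d' = k/n as stated in the text; the criterion value of 100 and the element counts of 700 and 300 are arbitrary numbers chosen for illustration, not empirical estimates.

```python
# Sketch of the relation d' = k / n and its consequence for list length.
# The criterion and the element counts below are arbitrary illustrative values.
import math

def d_prime(k, n):
    """Signal-to-noise ratio for item information: k feature elements, n list items."""
    return k / n

def max_list_length(k, criterion):
    """Largest n for which d' = k / n still reaches the required criterion."""
    return math.floor(k / criterion)

criterion = 100        # hypothetical d' needed for accurate recall
intact_elements = 700  # hypothetical size of the intact attribute domain
lesioned_elements = 300

print(max_list_length(intact_elements, criterion))    # 7
print(max_list_length(lesioned_elements, criterion))  # 3
```

Holding the required d' fixed, the recoverable list length shrinks in direct proportion to the number of surviving feature elements.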

Traditionally, in psychology, the mechanisms of perception and of memory have been studied in rather separate compartments. In contrast, the way of thinking inspired by the conception of distributed associative memory strongly discourages any such separation. The case of auditory-verbal short-term memory perhaps provides one example.

Concluding remarks

Few would claim that the currently available systems of classifying dysphasic impairments are wholly adequate; still less that the theoretical framework underlying such classification, and in terms of which these impairments are to be understood-and remedi­ated-is satisfactory; or even that there is a coherent theoretical framework at all. The traditional objective, of assigning dysphasic patients to one or another of a set of mutually exclusive categories, in practice results in most patients being categorized-if one is honest-as 'mixed', a category that is of singularly little use either as regards decisions about, or evaluation of, therapy.

In contrast, the 'modular subsystems' approach adopted increas­ingly by cognitive neuropsychologists, where it has been system­atically applied, has shown the ability to provide not only a coherent descriptive classification of impairments but to offer genuinely new insights into the functional/causal relationships between them (e.g. Patterson & Coltheart, 1984). Where the modular subsystems approach, on its own, fails to provide insight, on the contrary, is in the most commonplace character of dysphasic (and other neuro­psychological) impairment: the so-called 'graceful degradation' of performance, whereby particular linguistic functions are impover­ished, slowed, subject to increased equivocation and error, rather than simple, all-or-none loss of function.

Distributed associative memory provides a potential account of these phenomena, as well as of many other fundamental properties of normal memory retrieval. The claim of this paper is that these two theoretical approaches precisely and necessarily complement each other. We need to work on them both.

Since the potential of these approaches has, as yet, only begun to be exploited, the future, for cognitive neuropsychology, of their combined application is still wide open. The prospect, however, looks encouraging.


1. These examples are all lexical (Allport & Funnell, 1981). Acquired disorders in other aspects of language (syntax, prosody, semantics) can similarly be thought of as disorders of memory retrieval.



2. In a representation such as a topographic map, for example, local contours are shown explicitly, whereas (say) the visibility of one point from another is only implicit in the representation: it must be derived by a further process of inference. For further discussion on this and related issues of representation and process, see Marr (1981), Chapter 1; Palmer (1978).

3. With notable exceptions, of course, including, e.g. Edelman & Mountcastle, 1978; Schmitt et al, 1981; Hinton & Anderson, 1981.

4. The number of completely orthogonal (unrelated) patterns that can be represented within a matrix is, of course, limited by the dimensionality-the number of independent elements-in the matrix. For discussion of capacity limitation in linear and non-linear matrix memories, see Willshaw, 1981.

5. Strictly, in simple matrix-memories, taking the dot-product. For a helpful introduction see Murdock, 1979; elementary matrix algebra required.

6. There may well be impairments in still lower-level auditory (or articulatory) attribute domains that need not be accompanied by impairments at the word-form level.


Albert M L 1976 Short-term memory and aphasia. Brain and Language 3: 28-33

Allport A 1977 What level of detail for cognitive theories? AISB Quarterly 27: 10-13

Allport D A 1980 Patterns and actions: cognitive mechanisms are content-specific. In: Claxton G (ed) Cognitive psychology: New directions. Routledge & Kegan Paul, London

Allport D A 1983a Language and cognition. In: Harris R (ed) Approaches to language. Pergamon, Oxford

Allport D A 1983b Auditory-verbal short-term memory and conduction aphasia. In: Bouma H, Bouwhuis D (eds) Attention & Performance 10. Erlbaum, Hillsdale NJ

Allport D A 1983c Speech production and comprehension: one lexicon or two? In: Prinz W & Sanders A F (eds) Cognition and motor processes. Springer, Berlin

Allport D A, Funnell E 1981 Components of the mental lexicon. Philosophical Transactions of the Royal Society (London) B295: 397-410

Anderson J A 1973 A theory for the recognition of items from short memorized lists. Psychological Review 80: 417-438

Anderson J A 1977 Neural models with cognitive implications. In: LaBerge D, Samuels S J (eds) Basic processes in reading. Erlbaum, Hillsdale NJ

Atkinson R C, Shiffrin R M 1971 The control of short-term memory. Scientific American 225: 82-90

Ballard D H, Hinton G E, Sejnowski T J 1983 Parallel visual computation. Nature 306: 21-26

Barlow H 1972 Single units and perception: a neuron doctrine for perceptual psychology. Perception 1: 371-394

Brener R 1940 An experimental investigation of memory span. Journal of Experimental Psychology 26: 467-482

Caramazza A, Berndt R S 1978 Semantic and syntactic processes in aphasia: a review of the literature. Psychological Bulletin 85: 898-918

Cermak L S, Craik F I M (eds) 1979 Levels of processing in human memory. Erlbaum, Hillsdale NJ

Collins A M, Loftus E F 1975 A spreading-activation theory of semantic processing. Psychological Review 82: 407-428

Coltheart M, Patterson K, Marshall J C (eds) 1980 Deep dyslexia. Routledge & Kegan Paul, London

Cowey A 1981 Why are there so many visual areas? In: Schmitt F O, Worden F G, Adelman G, Dennis S G (eds) The organization of the cerebral cortex. MIT Press, Cambridge MA

Crowder R G 1976 Principles of learning and memory. Erlbaum, Hillsdale NJ

Edelman G M 1981 Group selection as the basis for higher brain function. In: Schmitt F O, Worden F G, Adelman G, Dennis S G (eds) The organization of the cerebral cortex. MIT Press, Cambridge MA

Edelman G M, Mountcastle V B (eds) 1978 The mindful brain: cortical organization and the group-selective theory of higher brain function. MIT Press, Cambridge MA

Ellis A W 1982 Spelling and writing (and reading and speaking). In: Ellis A W (ed) Normality and pathology in cognitive functions. Academic Press, London

Fahlman S E 1979 NETL: a system for representing and using real-world knowledge. MIT Press, Cambridge MA

Fahlman S E, Hinton G E, Sejnowski T J 1983 Massively parallel architectures for AI: Netl, Thistle, and Boltzmann machines. Proceedings of the National Conference on Artificial Intelligence, Washington, DC

Finger S, Stein D G 1982 Brain damage and recovery. Academic, New York

Fodor J A 1982 The modularity of mind.

Freud S 1891 Zur Auffassung der Aphasien. Deuticke, Vienna

Gardner H 1973 The contribution of operativity to naming capacity in aphasic patients. Neuropsychologia 11: 213-220

Glenberg A M, Kraus T A 1981 Long-term recency is not found on a recognition test. Journal of Experimental Psychology: Learning, Memory and Cognition 7: 475-479

Gordon B 1982 Confrontation naming: computational model and disconnection simulation. In: Arbib M A, Caplan D, Marshall J C (eds) Neural models of language processes. Academic, New York

Hinton G E 1981 Implementing semantic nets in parallel hardware. In: Hinton G E, Anderson J A (eds) Parallel models of associative memory. Erlbaum, Hillsdale NJ

Hinton G E, Anderson J A (eds) 1981 Parallel models of associative memory. Erlbaum, Hillsdale NJ

Hebb D O 1949 The organization of behavior. Wiley, New York

Heilman K M, Scholes R, Watson R T 1976 Defects of immediate memory in Broca's and Conduction aphasia. Brain and Language 3: 201-208

Hunt R R, Elliott J M 1980 The role of nonsemantic information in memory: orthographic distinctiveness effects on retention. Journal of Experimental Psychology: General 109: 49-74

Kay R H 1982 Hearing of modulation in sounds. Physiological Reviews 62: 894-975

Kinsbourne M, Wood F 1982 Theoretical considerations regarding the episodic-semantic memory distinction. In: Cermak L S (ed) Human memory and amnesia. Erlbaum, Hillsdale NJ

Klatt D H 1980 Speech perception: a model of acoustic-phonetic analysis and lexical access. In: Cole R A (ed) Perception and production of fluent speech. Erlbaum, Hillsdale NJ

Kohonen T 1977 Associative memory: a system-theoretical approach. Springer, Berlin

Kohonen T, Lehtio P, Oja E 1981 Storage and processing of information in distributed associative memory systems. In: Hinton G E, Anderson J A (eds) Parallel models of associative memory. Erlbaum, Hillsdale NJ

Lesser R 1978 Linguistic investigations of aphasia. Arnold, London

Marin O S M, Saffran E M, Schwartz M 1976 Dissociations of language in aphasia: implications for normal function. Annals of the New York Academy of Science 280: 868-884

Marr D 1981 Vision. Freeman, San Francisco

Marr D, Nishihara H K 1978 Visual information processing: Artificial Intelligence and the sensorium of sight. Technology Review 81: 2-23

McClelland J L, Rumelhart D E 1981 An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review 88: 375-407

Miller G A 1956 The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review 63: 81-97

Minsky M 1979 The Society theory of thinking. In: Winston P H, Brown R H (eds) Artificial Intelligence: an MIT perspective. MIT Press, Cambridge MA

Monsell S 1983 Components of working memory underlying verbal skills: a 'distributed capacities' view. In: Bouma H, Bouwhuis D (eds) Attention and Performance 10. Erlbaum, Hillsdale NJ

Morton J 1980 The Logogen model and orthographic structure. In: Frith U (ed) Cognitive processes in spelling. Academic, London

Mountcastle V B 1978 An organizing principle for cerebral function: the unit module and the distributed system. In: Edelman G M, Mountcastle V B (eds) The mindful brain: cortical organization and the group-selective theory of higher brain function. MIT Press, Cambridge MA

Murdock B B 1979 Convolution and correlation in perception and memory. In: Nilsson L G (ed) Perspectives on memory research. Erlbaum, Hillsdale NJ

Newcombe F, Ratcliff G 1979 Long-term psychological consequences of cerebral lesions. In: Gazzaniga M S (ed) Handbook of behavioral neurobiology, Vol 2: Neuropsychology. Plenum, New York

Norman D A (ed) 1981 Perspectives in cognitive science. Erlbaum, Hillsdale NJ

Palmer S E 1978 Fundamental aspects of cognitive representation. In: Rosch E, Lloyd B (eds) Cognition and categorization. Erlbaum, Hillsdale NJ

Patterson K, Coltheart M 1984 Acquired disorders of reading: a psycholinguistic description. In: Oxbury J, Whurr R, Wyke M, Coltheart M (eds) Aphasia. Butterworth, London

Perrett D I, Rolls E T, Caan W 1982 Visual neurones responsive to faces in the monkey temporal cortex. Experimental Brain Research 47: 329-342

Posner M I, Keele S W 1970 Retention of abstract ideas. Journal of Experimental Psychology 83: 304-308

Ratcliff R 1978 A theory of memory retrieval. Psychological Review 85: 59-108

Schmitt F O, Worden F G, Adelman G, Dennis S G (eds) 1981 The organization of the cerebral cortex. MIT Press, Cambridge MA

Shallice T 1979 Case study approach in neuropsychological research. Journal of Clinical Neuropsychology 1: 183-211

Shallice T, Warrington E K 1970 Independent functioning of verbal memory stores: a neuropsychological study. Quarterly Journal of Experimental Psychology 22: 261-273

Shallice T, Warrington E K 1977 Auditory-verbal short-term memory impairment and conduction aphasia. Brain and Language 4: 479-491

Solso R L, McCarthy J E 1981 Prototype formation of faces. British Journal of Psychology 72: 499-503

Taylor A M, Warrington E K 1973 Visual discrimination in patients with localized cerebral lesions. Cortex 9: 82-93

Tulving E 1983 Elements of episodic memory. Clarendon Press, Oxford

Warrington E K 1975 The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology 27: 635-657

Warrington E K, Shallice T 1969 The selective impairment of auditory verbal short-term memory. Brain 92: 885-896

Willshaw D 1981 Holography, associative memory and inductive generalization. In: Hinton G E, Anderson J A (eds) Parallel models of associative memory. Erlbaum, Hillsdale NJ

Wood C C 1978 Variations on a theme of Lashley: Lesion experiments on the neural model of Anderson, Silverstein, Ritz and Jones. Psychological Review 85: 582-591

Wood C C 1982 Implications of simulated lesion experiments for the interpretation of lesions in real nervous systems. In: Arbib M A, Caplan D, Marshall J C (eds) Neural models of language processes. Academic, New York

3 B. Butterworth

Jargon aphasia: processes and strategies


Jargon is a rare and spectacular manifestation of an aphasic condition. Critchley (1970) defined it as 'a type of speech impairment whereby the patient emits a profusion of utterances, most of which are incomprehensible to the hearer, though not perhaps to the speaker.' Words are often quite inappropriate in context; some words are not to be found in the dictionary; syntax is frequently odd and erroneous; empty phrases and circumlocutions abound. Not infrequently, this speech is diagnosed as demented and it is only through the happy intervention of a knowledgeable doctor, speech therapist or psychologist that the patient is saved from psychiatric help. Take the case of Mr K. 'He enjoyed good health until the winter of 1970 when, at the age of 76, his language behaviour suddenly became grossly abnormal. This occurred to such an extent that his next of kin thought he had just been struck by sudden madness and decided, somewhat hastily, to have him interned in a lunatic asylum. Mr K. understood the meaning of this decision and resented it; indeed he never forgot or forgave although he later agreed that his verbal protests could hardly have helped. After a week or so at the asylum, he had the good fortune of being visited by a knowledgeable intern. As a consequence he was transferred to the aphasia unit of la Salpêtrière where clinical manifestations of a left posterior sylvian softening were observed.' (Lecours et al., 1981: Case No. 1).

In this chapter, I will describe the typical disorders of sentence construction and of words found in jargon, and offer some suggestions as to the deficits and compensatory strategies that produce them. In particular, I shall explore the idea, first hinted at by Freud (1891), that these patients suffer from no loss of grammatical or lexical knowledge.

The most striking cases, those containing neologisms, are encountered only infrequently: in a sample of 420 aphasics, Kertesz