Plunkett - Early language acquisition Theories of early ...tween the critical functional anatomy of...



tel: +44 1865 271398 fax: +44 1865 310447 e-mail: plunkett@psy.oxford.ac.uk

Plunkett - Early language acquisition

Theories of early language acquisition

Kim Plunkett

[...] play a crucial role in [...] studies of developmental change in children promise to contribute to [...] understanding of how the brain becomes wired up for language. In this review, the multidisciplinary perspectives of cognitive neuroscience, experimental psycholinguistics and neural network modelling are brought to bear on four distinct areas in the study of language acquisition: early speech perception, word recognition, word learning and the acquisition of grammatical inflections. It is suggested that each area shows how linguistic development can be driven by the interaction of general learning mechanisms, highly sensitive to particular statistical regularities in their input, with a richly structured environment which provides the necessary [...] principles guiding the emergence of [...]

Very few of the major landmarks in language development have been tagged to specific aspects of brain development. Although many critical events in neural development occur during the language learning years and their potential role in language development has been discussed [1-3], in most cases the causal link between language development and behaviour remains unclear. Language development may depend on neuroanatomical and neurophysiological changes in the brain. Conversely, language development may reflect a fine tuning of prefashioned, genetically determined structures and a consolidation of cortical representations. Recent neuroimaging and neuropsychological research on infants and children shows that the developing language system undergoes striking changes in neural organization, displaying extensive plasticity (see Box 1), and that its final layout depends critically on experience. At the same time, experimentalists have uncovered an increasingly detailed picture of the sophisticated linguistic propensities of the young infant. In this article, I propose that these findings indicate the need for learning mechanisms that are finely tuned to extracting statistical properties of the speech input. Computational implementations of these learning mechanisms provide the language acquisition researcher with important clues as to how the brain becomes wired up for language.

Early speech perception

Newly born infants can discriminate the speech contrasts of all human languages [4,5]. Furthermore, their way of categorizing speech sounds is universal, so that a child born to Japanese-speaking parents has the same phonemic category boundaries as a child born to Spanish-speaking parents. For some non-native speech contrasts, the ability to discriminate remains intact until fairly late in the first year. For example, infants of 6-8 months of age from an English-speaking background have been shown to be able to distinguish the glottalized velar/uvular stop contrast [k'i]-[q'i] in Nthlakapmx, a language spoken by people in a tribe indigenous to the Northwest Pacific, and the Hindi voiceless aspirated versus breathy voiced contrast [tʰ]-[dʰ], neither of which is exploited in English. For English-learning infants, this ability declines after the age of 10-12 months as their phonological processing develops [6]. For other contrasts, discrimination sensitivity can decline even earlier. For example, Polka and Werker [7] have shown that English infants have already lost the ability to discriminate the German vowel contrasts [Y]-[U] by 6-8 months of age. It would appear that sensitivity to non-native contrasts declines earlier for vowels than for consonants [8]. Interestingly, however, not all sensitivities to non-native contrasts decline in this fashion. Best et al. [9] have shown that even English-learning infants aged 12-14 months are able to discriminate Zulu click contrasts, as are English-speaking adults.

Copyright © 1997, Elsevier Science Ltd. All rights reserved. 1364-6613/97/$17.00 PII: S1364-6613(97)01039-5

Trends in Cognitive Sciences - Vol. 1, No. 4, July 1997

Box 1. Language development in children with focal lesions

Children with early unilateral brain injury to either brain hemisphere typically go on to achieve levels of language performance that are within the normal range. Earlier reports on recovery of language in children with focal brain injury led some investigators to conclude that initially the two hemispheres are equipotential for language [a]. This remarkable plasticity of the developing language system was investigated at different stages of language acquisition in a recent neuropsychological study with a large sample of subjects. Bates et al. [b] reported on the effects of focal brain injury during the early development of language in 53 children. All children had suffered a single, unilateral brain injury to the left or right hemisphere, incurred before six months of age. The results indicated a striking difference between the critical functional anatomy of the developing and the mature language system. All of the most consistent hypotheses regarding language organization in the adult were violated, namely: left hemisphere dominance, left temporal involvement in language comprehension and left frontal involvement in expressive language functions and grammar. Studies on infants aged 10-17 months showed that children with right hemisphere injuries are at a greater risk of comprehension impairments. Children with left temporal lesions showed significantly greater delays in expressive vocabulary and grammar between the ages of 10 and 44 months. Frontal lesions to either hemisphere disrupted the typical dramatic increase in vocabulary growth observed around 21 months, independently of motor impairments. The implication of this important study is that the localization of language and other cognitive functions in the adult may reflect the developmental or experience-dependent status of a behaviour.

References

a Lenneberg, E.H. (1967) Biological Foundations of Language, Wiley

b Bates, E. et al. From first words to grammar in children with focal brain injury Dev. Neuropsychol. (in press)

The mechanisms by which the newly born infant can discriminate all human speech sounds, and the process by which the child becomes attuned to the parental language, are not well understood. Dehaene-Lambertz and Dehaene [10] have shown that auditory event-related potentials (ERPs) can be used to unravel the temporal and spatial organization of the neuronal processes underlying phoneme discrimination. They played two-month-old infants synthesized speech stimuli as groups of five syllables (e.g. /ba/, /ba/, /ba/, /ba/, /pa/) where the first four syllables were identical (the standard) and the fifth was either identical or phonetically different (the deviant). A significant difference in auditory ERPs between standard and deviant stimuli showed that the infant could discriminate the deviant stimuli. Christophe and Morton [11] suggest that this technique might be used to study the developmental profile of responses to native and non-native contrasts, thereby shedding light on whether the brain is still sensitive to non-native speech contrasts, but ignores the information, or whether the ability to discriminate non-native speech contrasts is truly lost.

Auditory ERPs offer an important new tool for studying which aspects of the acoustic stimulus young infants are sensitive to, when they are sensitive to them and even, in some cases, the parts of the brain that are the most revealing of these discriminatory capacities. However, ERP measurements are unlikely to tell us how the brain actually accomplishes the task. Nakisa and Plunkett [12] have developed a neural network model of early phonological development. The model is based roughly on Jusczyk and Bertoncini's [13] proposal that the development of speech perception should be viewed as an innately guided learning process: learning the speech contrasts of the native language takes place rapidly because the system is innately structured to be sensitive to correlations of certain distributional properties of the speech stimulus and not others.

The model is based on vertebrate neuronal development. Neurons are allowed to migrate, grow axons and synapses under the control of genes for various trophic factors. Other genes then control the means by which synapses are modified by experience. A population of these neural networks is generated and allowed to breed, with a selective pressure for networks that respond in the desired way to speech sounds. Network fitness is calculated using the stored output unit activities after the network has been exposed to a test set of spoken English sentences. The fitness function favoured networks that represented occurrences of the same phoneme as similarly as possible and different phonemes as differently as possible. When, after many generations of this evolutionary process, one of these neural networks is exposed to speech spectra from any of 14 human languages (including English, Cantonese, Swahili, Farsi, Czech, Hindi, Hungarian, Korean, Polish, Russian, Slovak, Spanish, Ukrainian and Urdu), it rapidly modifies its connections and creates a representation of speech sounds that is the same regardless of the language to which it has been exposed. Furthermore, the internal representations of speech in the network show the same categorical boundaries that are observed in adult and infant perception (see Box 2). Once a network architecture has been selected by the evolutionary process, only two minutes of speech are required to train the network.
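The fitness criterion described above can be illustrated with a small sketch. It is not the fitness function of the Nakisa and Plunkett model, whose details are not given here: the function name `fitness`, the use of Euclidean distance and the 'between minus within' score are all assumptions made for illustration, and the activity values are invented. The sketch rewards a set of labelled output vectors when tokens of the same phoneme lie close together and tokens of different phonemes lie far apart.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length output vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fitness(outputs):
    """Score labelled output vectors (hypothetical criterion).

    `outputs` is a list of (phoneme_label, vector) pairs. The score is the
    mean between-phoneme distance minus the mean within-phoneme distance,
    so it rises when tokens of one phoneme cluster together and tokens of
    different phonemes lie far apart.
    """
    within, between = [], []
    for (label_a, vec_a), (label_b, vec_b) in combinations(outputs, 2):
        target = within if label_a == label_b else between
        target.append(euclidean(vec_a, vec_b))
    if not within or not between:
        raise ValueError("need at least two phonemes and a repeated phoneme")
    return sum(between) / len(between) - sum(within) / len(within)


# Toy output unit activities for two phonemes (made-up numbers).
well_separated = [("s", [0.9, 0.1]), ("s", [0.8, 0.2]),
                  ("sh", [0.1, 0.9]), ("sh", [0.2, 0.8])]
overlapping = [("s", [0.5, 0.5]), ("s", [0.1, 0.9]),
               ("sh", [0.5, 0.4]), ("sh", [0.9, 0.1])]

print(fitness(well_separated) > fitness(overlapping))
```

In an evolutionary setting, a score of this kind would be computed for every network in the population and used to decide which networks are allowed to breed.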

The innately guided learning exhibited by this network enables it to learn very quickly and makes it less dependent on the 'correct' environmental statistics. The model offers an account of how infants from different linguistic environments can learn the same featural representation so soon after birth. In this sense, innately guided learning as implemented in this model is half-way between nativism and constructivism. It shows how genes and the environment can interact to ensure rapid development of a featural representation of speech on which further linguistic development depends.


Box 2. Categorical perception in the Nakisa and Plunkett [a] model

Categorical perception of phonemes is a robust phenomenon observed in both infants and adults. The network was tested on a series of 11 spectra which formed a linear continuum from a pure /sh/ to a pure /s/. Each of the 11 spectra in the continuum was fed individually into a network that had been trained on 30 sentences of continuous speech in English. The output feature responses were stored for each spectrum in the continuum. The distances of these feature vectors from the pure /sh/ and pure /s/ indicated the categorical nature of the network's internal representations of the speech spectra, as shown in the figure below.

All of the human languages tested seemed to be equally effective for training the network to represent English speech sounds. To see whether any sounds could be used for training, the network was trained on white noise. This resulted in slower learning and a lower final fitness. The fitness for a network trained on white noise never reached that of the same network trained on human speech. An even worse impediment to learning was to train on low-pass filtered human speech.

[Figure: horizontal axis 'Sample', numbered 0-10, running from /s/ to /sh/.]

Reference

a Nakisa, R.C. and Plunkett, K. Innately guided learning by a neural network: the case of featural representation of speech Lang. Cogn. Processes (in press)
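The continuum test described in Box 2 can be sketched as follows. This is an illustration under strong assumptions, not the model itself: the two three-band 'spectra' are invented, and the function `respond` is a hypothetical stand-in for a trained network (a single steep sigmoid unit), so the categorical shape of the result is built in rather than learned. What the sketch does show is the procedure: interpolate linearly between two endpoint spectra, record the response to each step, and measure the distance of each response from the two endpoint responses.

```python
import math


def interpolate(start, end, steps):
    """Return `steps` vectors spaced linearly from `start` to `end`."""
    return [
        [s + (e - s) * i / (steps - 1) for s, e in zip(start, end)]
        for i in range(steps)
    ]


def respond(spectrum, weights, gain=12.0):
    """Hypothetical stand-in for a trained network: one sigmoid feature unit."""
    net = sum(w * x for w, x in zip(weights, spectrum))
    return [1.0 / (1.0 + math.exp(-gain * net))]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up three-band 'spectra' for the two endpoint sounds.
pure_s = [0.1, 0.3, 0.9]
pure_sh = [0.2, 0.9, 0.3]
weights = [0.0, -1.0, 1.0]  # assumed: the unit contrasts the two upper bands

continuum = interpolate(pure_s, pure_sh, 11)
features = [respond(spectrum, weights) for spectrum in continuum]
dist_to_s = [distance(f, features[0]) for f in features]
dist_to_sh = [distance(f, features[-1]) for f in features]

for i, (d_s, d_sh) in enumerate(zip(dist_to_s, dist_to_sh)):
    print(f"sample {i:2d}  to /s/: {d_s:.2f}  to /sh/: {d_sh:.2f}")
```

Although the input changes by equal steps, the distances stay near one endpoint for the first few samples and then switch quickly to the other near the middle of the continuum, which is the signature of a categorical boundary.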

Word recognition

During the first year of life, infants become attuned to more than the phonemic contrasts of their native language. They pick up knowledge that enables them to identify words and other linguistic units in speech. Jusczyk and Aslin [14] have used the familiarization-preferential looking procedure (see Box 3) to demonstrate that even infants aged seven and a half months have some ability to detect words in fluent speech. Jusczyk et al. [15] showed that nine-month-old American and Dutch infants prefer to listen to word lists that conform to the phonetic and phonotactic structure of their own language. In contrast, six-month-old American infants showed no preference for lists from either language.

There is also evidence that infants are sensitive to the prosodic organization of their native language. Jusczyk et al. [16] report that prelinguistic infants have identified a regularity of English wherein disyllabic words tend to adhere to a trochaic (strong-weak) stress pattern [17]. Newsome and Jusczyk [18] show that infants aged seven and a half months can use this knowledge to segment disyllabic words from the main speech stream. Young children are also more likely to imitate syllables that are stressed or word-final than syllables that are both unstressed and nonfinal [19]. Fernald et al. [20], using a preferential looking task, have shown that infants are more likely to recognize familiar words in utterance-initial or final position than when the word occurs in the middle of the utterance.

Saffran et al. [21] have focused on the ability of young children to acquire linguistic structure via statistical cues. They point out that the statistical properties of multisyllabic words are potentially useful for infant word segmentation. Over a corpus of speech sounds, there are measurable regularities that distinguish those recurring sound sequences that comprise words from the more accidental sound sequences which occur across word boundaries. Using the familiarization-preferential looking procedure (see Box 3), Saffran et al. [21] showed that eight-month-old infants are able to perform the necessary statistical computations. Following a two-minute exposure to a synthetic speech stream containing only statistical cues to word boundaries, the infants' listening preferences demonstrated that they had extracted and remembered serial order information about the familiarization items, distinguishing 'words' (recurrent syllable sequences) from syllable strings spanning word boundaries. This preferential behaviour indicates that the infants computed the co-occurrence frequencies for pairs of sounds across the familiarization corpus. Jusczyk et al. [22] report that nine-month-old infants (but not six-month-olds) are attentive to the frequency with which phonotactic sequences occur within English. These results, together with the findings of Saffran et al. [21], suggest that infants have access to a powerful mechanism for the computation of statistical properties of the language input even from very brief exposures and that this develops sometime between the age of six and nine months. They indicate that infants may be far better at deriving structure from statistical information than has often been assumed in the acquisition literature. In particular, certain aspects of language that are argued to be unlearnable and thus innately specified may be discoverable by appropriately constrained statistical learning mechanisms.
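The kind of computation at issue can be sketched in a few lines. The sketch is an illustration, not the procedure or materials of Saffran et al.: the three nonsense 'words', the stream length and the function name `transitional_probabilities` are invented for the example, and forward transitional probability is used here as one common way of normalizing pairwise co-occurrence counts. It shows that syllable pairs inside a recurring 'word' are more predictable than pairs that straddle a word boundary.

```python
import random
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent pair (a, b) to count(a followed by b) / count(a)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical three-syllable nonsense words (not the original stimuli).
words = [("pa", "bi", "ku"), ("ti", "bu", "do"), ("go", "la", "tu")]

# Build a continuous stream with no pauses: words in random order,
# never repeating the same word twice in a row.
rng = random.Random(0)
stream, previous = [], None
for _ in range(300):
    word = rng.choice([w for w in words if w != previous])
    stream.extend(word)
    previous = word

tp = transitional_probabilities(stream)
within_word = tp.get(("pa", "bi"), 0.0)      # pair inside a 'word'
across_boundary = tp.get(("ku", "ti"), 0.0)  # pair spanning a word boundary

print(f"within word:     {within_word:.2f}")
print(f"across boundary: {across_boundary:.2f}")
```

A learner tracking these values could posit word boundaries wherever the transitional probability dips, without any pauses or stress cues in the stream.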

Recent connectionist and statistical analyses [23-25] of the properties of real language corpora have contributed to the view that the distributional information in the input may be very useful to the language learning child. For example, Christiansen et al. [25] trained a simple recurrent network on a phoneme prediction task. The model was explicitly provided with information about phonemes, relative lexical stress and boundaries between utterances. Individually, these sources of information provide relatively unreliable cues to word boundaries and no direct evidence about actual word boundaries. After training on a large corpus of child-directed speech, the model was able to use these cues to reliably identify word boundaries. The model shows that aspects of linguistic structure that are not overtly marked in the input can be derived by efficiently combining multiple probabilistic cues.

Box 3.

The familiarization-preference procedure was developed by Jusczyk and Aslin [a]. In this procedure, infants are exposed to auditory material that serves as a potential learning experience. Subsequently, they are presented with two types of test stimuli: (a) items that were contained within the familiarization material, and (b) items that were highly similar but are not contained within the familiarization material. During a series of test trials that immediately follows familiarization, infants control the duration of each test trial by their sustained visual fixation of a blinking light. If infants have extracted the crucial information about the familiarization items, they may show differential durations of fixation (listening) during the two types of test trials.

[Figure: experimental set-up, with labels for the child, parent, video camera, flashing light and loudspeaker.]

Reference

a Jusczyk, P.W. and Aslin, R.N. (1995) Infant's detection of sound patterns of words in fluent speech Cognitive Psychol. 29, 1-23

Word learning

One of the most dramatic manifestations of language development during the first two years of life is the rapid increase in the rate of vocabulary development observed at around 18 months of age. This spurt in development usually occurs in both comprehension and production [26]. There are three main families of theories about the mechanism underlying this vocabulary spurt. These are linguistic development [27-29], conceptual development [30,31] and the development of constraints on word learning [32-34]. All of these theories postulate the triggering of a new principle of organization into the child's understanding of the object/label relationship. Woodward et al. [35] argue that these explanations imply that learning a new word prior to the vocabulary spurt is likely to be a time-consuming process, requiring considerable exposure to a new word. There is growing evidence, however, that the young (prevocabulary spurt) child may not be as hampered in learning new words as was thought previously. Woodward et al. [35] report that, under favourable circumstances, 13-month-old infants can learn novel words from as few as nine presentations of a novel word token. Baldwin [36] argues that joint attention between the infant and its instructor is necessary for word learning. However, Schafer and Plunkett [37] have succeeded in replicating the findings of Woodward et al. under conditions that do not require the presence of an instructor, suggesting that the prevocabulary spurt child is already equipped with a powerful learning mechanism for forming object-label associations.

Imaging early language development

Event-related potentials have been used to examine developmental changes in neural processing in normal children. Mills et al. [38] examined the changes in the organization of brain activity linked to comprehension of single words in infants aged from 13 to 20 months. Auditory event-related potentials were recorded as children listened to a series of words whose meanings they did or did not understand, and to words pronounced backwards. The ERPs differed as a function of word comprehension within 200 ms after word onset. At 13-17 months of age comprehension-related differences were bilateral and broadly distributed over the anterior and posterior cortex. In contrast, at 20 months of age these effects were limited to temporal and parietal regions of the left scalp.

These results indicate that the neural organization for word comprehension shifts, and that this shift occurs precisely during the period of development when language acquisition is most pronounced. The implication is that plasticity and reorganization may be natural properties of the developing system and are not restricted to compensatory changes in damaged brains. Mills et al. [38] suggest that aspects of their ERP findings are linked to changes in early lexical development that occur typically between 13 and 20 months of age. However, it is still unclear whether the changes in ERP reflect qualitative changes in the underlying language processing of lexical items or a consolidation of existing lexical representations. Schafer and Plunkett [37] have developed a technique based on the preferential-looking task for training infants on novel words. Combining novel word learning with ERP measurements offers an opportunity to evaluate whether the observed hemispheric specialization for lexical processing arises from prolonged experience with words or from the development of new cognitive processing strategies.

Box 4. Multimodal model of vocabulary acquisition

(A) Profile of vocabulary scores typical for many children during their second year (taken from Ref. a). Each data point indicates the number of different words used by the child during a recording session. The vocabulary spurt that occurs around 22 months is observed in many children. It usually consists of an increased rate of acquisition of nominals, specifically names for objects [b]. (B) Simplified version of the network architecture used in Plunkett et al. [c] The image is filtered through a retinal pre-processor prior to presentation to the network. Labels and images are fed into the network through distinct 'sensory' channels. The network is trained to reproduce the input patterns at the output, a process known as autoassociation. Production corresponds to generating a label at the output when an image is presented at the input. Comprehension corresponds to generating an image at the output when a label is presented at the input. The model exhibits the same non-linear pattern of vocabulary growth observed in young children, both in comprehension and production. Furthermore, comprehension scores are always ahead of production scores. The network model also produces over- and under-extension errors.

[Figure, panel A: 'Jens' vocabulary' (vertical axis, 0-140) plotted against age in months (horizontal axis, up to 24).]

[Figure, panel B: network with 'Preprocessed image' and 'Label' at both input and output.]

References

a Plunkett, K. (1993) Lexical segmentation and vocabulary growth in early language acquisition J. Child Lang. 20, 43-60

b McShane, J. (1979) The development of naming Linguistics 17, 879-905

c Plunkett, K. et al. (1992) Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net Connect. Sci. 4, 293-312

Connectionist modelling of non-linear word learning

Recent work in connectionist modelling has demonstrated that small and gradual changes in a network, not involving the maturation of new systems, can lead to dramatic non-linearities in its behaviour. For example, Plunkett et al. [39] have developed a connectionist model of children's vocabulary development that involves an auto-associative process of relating labels to images. Although the training of the model involved small continuous changes in the connection strengths within and across the different processing modalities in the network (see Box 4), the linguistic behaviour of the network exhibited dramatic non-linearities in vocabulary development, mimicking the well known vocabulary spurt that occurs in young children towards the end of their second year. Furthermore, the model showed clear-cut dissociations in receptive and expressive vocabulary development, suggesting that the asymmetries between comprehension and production that are often observed in young children may be a natural outcome of the child's attempt to integrate multiple representations. This modelling work suggests that the results observed in children with focal lesions (see Box 1) and in ERP studies of word comprehension need not necessarily imply prewired, dedicated modules. The results are entirely consistent with the non-linear onset of overt behaviours linked to gradual experience-driven learning processes and coordination of multiple representations in the underlying neural system.

Inflectional morphology

Symbolic accounts of the acquisition of inflectional morphology [40,41] assume a dual-route mechanism for the processing of regular and exceptional words: a rule-governed process attempts to inflect all words, while an associative memory attempts to identify the exceptions to the rule and block its application. For example, in this view, the plural formation of 'sheeps' is blocked by the identification of the exceptional plural form 'sheep' in associative memory whereas plural formation of 'boys' is achieved by application of the rule (add /s/) to the word 'boy'. The rule-governed process acts as a default that applies to any word, offering the language user economy in representation (no need to store information about inflected forms that conform to the default) and creativity (the capacity to inflect forms previously not encountered).
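The dual-route mechanism just described can be sketched as a memory lookup followed by a default rule. The example is a deliberately minimal illustration rather than any published implementation: the function name `pluralize`, the tiny exception table and the orthographic 'add s' rule are assumptions standing in for the associative memory and the default rule.

```python
# Associative memory: a small, hypothetical table of exceptional plurals.
EXCEPTIONS = {"sheep": "sheep", "man": "men", "child": "children"}


def default_rule(stem):
    """Default route: add 's' to any stem (a simplified orthographic rule)."""
    return stem + "s"


def pluralize(stem, memory=EXCEPTIONS):
    """Dual-route inflection: a stored exception blocks the default rule."""
    stored = memory.get(stem)
    if stored is not None:
        return stored          # exception found, rule application blocked
    return default_rule(stem)  # no exception, the default applies


for noun in ["boy", "sheep", "wug"]:
    print(noun, "->", pluralize(noun))
```

Only the exceptions need to be stored, which is the economy the account claims, and a novel stem such as 'wug' is inflected by the default without any prior experience of it.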

In contrast, connectionist accounts of the acquisition of inflectional morphology [42-46] assume a single-route mechanism for the processing of both regular and exceptional forms. There is no distinction in the manner in which regular and exceptional forms are handled in this account. They are processed by the same network of connections that maps an uninflected form of the word to its inflected form. The network's capacity to inflect novel forms is shaped by its experience with the forms on which it has already been trained.
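A trained network is too long to show here, so the sketch below swaps in a much simpler technique, nearest-neighbour generalization over stored forms, to illustrate the single-route idea that regular and exceptional items are handled by one mechanism and that novel forms are inflected by analogy with experience. The toy lexicon, the suffix-overlap similarity measure and the function names are all assumptions made for the example; it is not the kind of model used in the connectionist work cited above.

```python
# One store for all items: (stem, past tense) pairs, regular and irregular.
LEXICON = [
    ("walk", "walked"), ("talk", "talked"), ("jump", "jumped"),
    ("sing", "sang"), ("ring", "rang"), ("play", "played"),
]


def shared_suffix(a, b):
    """Number of matching characters counted from the ends of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def change_of(stem, inflected):
    """Describe a mapping as (stem ending removed, ending added)."""
    k = 0
    while k < min(len(stem), len(inflected)) and stem[k] == inflected[k]:
        k += 1
    return stem[k:], inflected[k:]


def inflect(stem, lexicon=LEXICON):
    """Inflect by analogy with the most similar stored stem."""
    known = dict(lexicon)
    if stem in known:
        return known[stem]
    model_stem, model_form = max(
        lexicon, key=lambda pair: shared_suffix(stem, pair[0])
    )
    removed, added = change_of(model_stem, model_form)
    if removed and stem.endswith(removed):
        return stem[: -len(removed)] + added
    return stem + added


for verb in ["sing", "balk", "pling"]:
    print(verb, "->", inflect(verb))
```

In this toy lexicon the novel stem 'balk' is regularized by analogy with 'walk', while 'pling' follows 'sing' and comes out as 'plang'. The same procedure handles both cases, so what a novel form does depends only on which stored items it resembles.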

In English, the inflectional systems of the past tense and the plural are highly regular. Irregular past tense forms and irregular noun plurals constitute only 14% and 2% of their respective systems [47]. The dual-route account of inflectional morphology is very efficient at representing these systems as only a minority of forms need to be stored in associative memory and the default rule can deal with the majority of forms. A connectionist network stores information about all the words. Nevertheless, the dominance of the regular words in the system results in the network producing regular responses to novel words. Consequently, both dual-route and connectionist approaches can explain the preponderance of regular responses to novel words by English speakers but for different reasons: the dual-route account exploits a default rule that attempts to regularize any word available to the language user; the connectionist account exploits the skewed distribution in favour of regular words in the language.

Minority defaults There is evidence from speakers of other languages that their ability to produce a default response to novel words or overgeneralize the default to exceptional words does not rely upon a numerical superiority of the words that epitomize the default in the language. For example, Clahsen et al.** and Marcus et aL.*’ claim that the ‘s’ plural in German is the default process even though it constitutes a minority of the plural forms in the language. A similar claim is made for the default status of the ‘sound’ plural in Arabic. These authors claim that languages whose speakers conform to a minority default pattern, appear to present a major challenge to con- nectionist accounts of inflectional morphology as networks operating on the principle of ‘similar inputs produce similar outputs’ are unlikely to produce a default response to novel forms.

Plunkett and Nakisa (Ref. 51) trained a neural network on the Arabic plural and evaluated its performance on words not encountered in the training set. They showed that the network was superior to the dual-route model at predicting the plural class of Arabic words on which it had never been trained. In particular, prediction of membership in the sound plural class was more accurate in the neural network model. In a similar fashion, Nakisa et al. (Ref. 52) have shown that a connectionist network trained on a subset of German plurals accurately predicts the class membership of German plurals that it has never seen before. The network is in much the same position as the Arabic or German child who may have to guess how to form the plural of a word. These results indicate that the distribution of nouns in Arabic and German may provide subtle clues to plural class membership which are not obvious even to sophisticated professional linguists.

Conclusions

As yet, none of the domains of language acquisition described above is properly understood. However, the picture of the language learning child is becoming increasingly refined as we uncover the details of what is developing and when development occurs, where the neural systems in the brain for controlling linguistic behaviours are located and how these systems actually function. Behavioural, neuropsychological and computational studies reveal that the young infant is richly endowed with neural systems that are well adapted to the business of linguistic information processing. At the same time, I believe that a multidisciplinary approach to the study of language acquisition points to the utility of viewing linguistic development as driven by the interaction of powerful general learning mechanisms with a richly structured environment that provides the necessary ingredients for the emergence of mature linguistic representations.

Hare et al. (Ref. 50) have demonstrated that connectionist models of inflectional morphology can learn a default response even in the absence of superior numbers for the default class. Two factors contribute to a network’s capacity to respond in a default-like fashion. First, words which look similar at the input need not have similar internal representations. Second, the distribution of the words in the language influences the ability of the network to act in a default-like fashion. Under appropriate conditions (see Box 5), it is possible for a network to learn a distributional default.

Outstanding questions

• What are the mechanisms that enable infants to tune in to the speech contrasts that are specific to their native language? Why do some non-native speech contrasts remain discriminable by adults and children while others are no longer perceived as distinct? Does the brain remain sensitive to non-native speech contrasts even though discrimination experiments demonstrate a lack of such sensitivity?

• Are the word-like linguistic chunks that infants extract from continuous speech stored in a prelexical mental repository awaiting adequate conceptual development to achieve semantic grounding? How does conceptual development influence the process of lexical segmentation? What developments underpin the dramatic changes in the statistical processing of speech that seem to occur between six and nine months of age?

• What is the nature of the label-object associations shown in recent demonstrations of rapid word learning in prevocabulary spurt infants? Is the lateralization of early word representations to the left hemisphere in postvocabulary spurt infants a consequence of prolonged experience with specific words or is it due to the emergence of new lexical processing strategies?

• What are the facts of acquisition in complex inflectional systems like the Arabic and German plural? Are there default processes operating in these and other languages or are over-regularizations and generalizations better understood as operating through processes governed by analogy and frequency?

Trends in Cognitive Sciences - Vol. 1, No. 4, July 1997


Box 5. How to obtain a minority default from a neural network

Forrester and Plunkett (Ref. a) trained a neural network to categorize input patterns as belonging to one of three classes. Each input pattern represented a point on a two-dimensional plane. The distribution of the points is shown below. The majority of the points are located in two squares. All the points within a square belong to the same class. These can be thought of as the exceptional patterns. The minority of the points are distributed outside these square regions. All the points outside the square regions belong to the same class. These can be thought of as the patterns representing the default class. The question of interest is how a network trained on this distribution of points generalizes to novel patterns, that is, the points in the two-dimensional plane on which it has never been trained.

The network used by Forrester and Plunkett (Ref. a) contained input units to specify the x, y coordinates in the two-dimensional plane, 20 hidden units that formed internal representations of the input patterns permitting a non-linear partitioning of the input space, and three output units to classify the input patterns. The network was trained with the points shown below and then tested on every point in the plane. The plots show the activation of the three classification units at different stages in training. Darker regions indicate higher activation. The final column (late training) shows that all three output units do a good job at partitioning the space by the end of training. In particular, most of the points in the two-dimensional plane are treated as though they belong to the third class, the so-called default, even though the training set contains only a minority of forms in this class. This example demonstrates how a neural network can be trained to produce a default-like response provided it has the resources to construct internal representations that permit a non-linear partitioning of the input space and provided the forms in the ‘language’ are appropriately distributed.

Reference

a Forrester, N. and Plunkett, K. (1994) Learning the Arabic plural: the case for minority default mappings in connectionist networks, in Proceedings of the Sixteenth Cognitive Science Society Annual Conference, Atlanta, GA (Vol. 16) (Ram, A. and Eiselt, K., eds), pp. 319-323, Erlbaum

[Figure: training points on the two-dimensional plane (+ Exception 2, o Exception 1, - Minority default) and the activation of the three classification units at three periods of network training: Early, Middle and Late.]
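The kind of simulation described in Box 5 can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not a reproduction of the original study: only the x, y inputs, the 20 hidden units and the three output units come from the text, while the positions of the squares, the numbers of training points, the tanh and softmax units, the learning rate and the number of epochs are all choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_square(p, lo, hi):
    """True when both coordinates of p lie in [lo, hi]."""
    return np.all((p >= lo) & (p <= hi), axis=-1)

def make_training_set(n_exception=40, n_default=20):
    """Two dense squares (exception classes 0 and 1) plus sparse outside points (class 2).
    Square positions and point counts are assumptions of this sketch."""
    sq1 = rng.uniform(0.15, 0.40, size=(n_exception, 2))
    sq2 = rng.uniform(0.60, 0.85, size=(n_exception, 2))
    outside = []
    while len(outside) < n_default:  # rejection sampling for the minority class
        p = rng.uniform(0.0, 1.0, size=2)
        if not (in_square(p, 0.15, 0.40) or in_square(p, 0.60, 0.85)):
            outside.append(p)
    X = np.vstack([sq1, sq2, np.array(outside)])
    y = np.array([0] * n_exception + [1] * n_exception + [2] * n_default)
    return X, y

def init_params(n_in=2, n_hidden=20, n_out=3):
    """x, y inputs, 20 hidden units and three outputs, as described in Box 5."""
    return {"W1": rng.normal(0, 1.0, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.1, (n_hidden, n_out)), "b2": np.zeros(n_out)}

def forward(params, X):
    """tanh hidden layer followed by a softmax output layer (assumed unit types)."""
    h = np.tanh(X @ params["W1"] + params["b1"])
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return h, probs / probs.sum(axis=1, keepdims=True)

def loss(params, X, y):
    """Mean cross-entropy of the correct class."""
    _, probs = forward(params, X)
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

def train(params, X, y, epochs=3000, lr=0.3):
    """Plain full-batch gradient descent with backpropagation."""
    onehot = np.eye(3)[y]
    for _ in range(epochs):
        h, probs = forward(params, X)
        d_logits = (probs - onehot) / len(y)
        d_h = (d_logits @ params["W2"].T) * (1 - h ** 2)
        params["W2"] -= lr * (h.T @ d_logits)
        params["b2"] -= lr * d_logits.sum(axis=0)
        params["W1"] -= lr * (X.T @ d_h)
        params["b1"] -= lr * d_h.sum(axis=0)
    return params

X, y = make_training_set()
params = init_params()
loss_before = loss(params, X, y)
params = train(params, X, y)
loss_after = loss(params, X, y)

# Test on every point of a grid covering the plane, trained or not.
axis = np.linspace(0.0, 1.0, 50)
grid = np.array([(gx, gy) for gx in axis for gy in axis])
_, grid_probs = forward(params, grid)
default_share = np.mean(grid_probs.argmax(axis=1) == 2)
print(f"loss {loss_before:.3f} -> {loss_after:.3f}; "
      f"share of plane assigned to the minority class: {default_share:.2f}")
```

The printed share is the quantity of interest: it reports how much of the plane the trained network assigns to the class that is a minority in the training set. Whether that class behaves as a default in a given run depends on the assumed distribution of points and on the training settings, which is the point the box makes about resources and distribution.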

References

1 Bates, E., Thal, D. and Janowsky, J. (1992) Early language development and its neural correlates, in Handbook of Neuropsychology (Vol. 7) (Rapin, I. and Segalowitz, S., eds), pp. 669-710, Elsevier

2 Elman, J.L. et al. (1996) Rethinking Innateness: a Connectionist Perspective on Development, MIT Press

3 Quartz, S.R. and Sejnowski, T.J. The neural basis of cognitive development: a constructivist manifesto Behav. Brain Sci. (in press)

4 Eimas, P.D. et al. (1971) Speech perception in infants Science 171, 303-306

5 Jusczyk, P.W. (1992) Developing phonological categories from the speech signal, in Phonological Development: Models, Research, Implications (Ferguson, C.A., Menn, L. and Stoel-Gammon, C., eds), pp. 17-62, York Press

6 Werker, J.F. and Tees, R.C. (1984) Cross-language speech perception: evidence for perceptual reorganization during the first year of life Infant Behav. Dev. 7, 49-63

7 Polka, L. and Werker, J.F. (1994) Developmental changes in perception of non-native vowel contrasts J. Exp. Psychol. Hum. Percept. Perform. 20, 421-435

8 Jusczyk, P.W. (1996) The Discovery of Spoken Language, MIT Press

9 Best, C.T., McRoberts, G.W. and Sithole, N.M. (1988) Examination of the perceptual reorganization for speech contrasts: Zulu click discrimination by English-speaking adults and infants J. Exp. Psychol. Hum. Percept. Perform. 14, 345-360

10 Dehaene-Lambertz, G. and Dehaene, S. (1994) Speed and cerebral correlates of syllable discrimination in infants Nature 370, 292-295

11 Christophe, A. and Morton, J. (1994) Comprehending baby-think Nature 370, 250-251

12 Nakisa, R.C. and Plunkett, K. Innately guided learning by a neural network: the case of featural representation of speech Lang. Cogn. Processes (in press)

13 Jusczyk, P.W. and Bertoncini, J. (1988) Viewing the development of speech perception as an innately guided process Lang. Speech 31, 217-238

14 Jusczyk, P.W. and Aslin, R.N. (1995) Infants’ detection of sound patterns of words in fluent speech Cognitive Psychol. 29, 1-23

15 Jusczyk, P.W. et al. (1993) Infants’ sensitivity to the sound patterns of native language words J. Mem. Lang. 32, 402-420

16 Jusczyk, P.W., Cutler, A. and Redanz, N. (1993) Preference for the predominant stress patterns of English words Child Dev. 64, 675-687


17 Cutler, A. and Carter, D.M. (1987) The predominance of strong initial syllables in the English vocabulary Comput. Speech Lang. 2, 133-142

18 Newsome, M. and Jusczyk, P.W. (1995) Do infants use stress as a cue for segmenting fluent speech? in 19th Boston University Conference on Language Development (MacLaughlin, D. and McEwen, S., eds), pp. 415-426, Cascadilla Press

19 Echols, C.H. and Newport, E.L. (1992) The role of stress and position in determining first words Language Acquisition 2, 189-220

20 Fernald, A., McRoberts, G.W. and Herrera, C. Effects of prosody and word position on infants’ ability to recognize words in fluent speech J. Exp. Psychol. Learn. Mem. Cogn. (in press)

21 Saffran, J.R., Aslin, R.N. and Newport, E.L. (1996) Statistical learning by 8-month-old infants Science 274, 1926-1928

22 Jusczyk, P.W., Luce, P.A. and Luce, J.C. (1994) Infants’ sensitivity to phonotactic patterns in the native language J. Mem. Lang. 33, 630-645

23 Brent, M.R. and Cartwright, T.A. (1996) Distributional regularity and phonotactic constraints are useful for segmentation Cognition 61, 93-125

24 Christiansen, M.H., Allen, J. and Seidenberg, M.S. Learning to segment speech using multiple cues: a connectionist model Lang. Cogn. Processes (in press)

25 Redington, M. and Chater, N. Connectionist and statistical approaches to language acquisition: a distributional perspective Lang. Cogn. Processes (in press)

26 Goldfield, E. and Reznick, J.S. (1992) Rapid change in lexical development in comprehension and production Dev. Psychol. 28, 406-413

27 Dore, J. (1978) Conditions for the acquisition of speech acts, in The Social Context of Language (Markova, I., ed.), pp. 87-111, Wiley

28 Lock, A. (1980) The Guided Reinvention of Language, Academic Press

29 Plunkett, K. (1993) Lexical segmentation and vocabulary growth in early language acquisition J. Child Lang. 20, 43-60

30 Corrigan, R. (1978) Language development as related to stage 6 object permanence development J. Child Lang. 5, 173-189

31 Gopnik, A. and Meltzoff, A. (1987) The development of categorization in the second year and its relation to the other cognitive and linguistic developments Child Dev. 58, 1523-1531

32 Markman, E.M. (1991) The whole object, taxonomic and mutual exclusivity assumptions as initial constraints on word meanings, in Perspectives on Language and Thought: Interrelations and Development (Byrnes, J.P. and Gelman, S.A., eds), pp. 72-106, Cambridge University Press

33 Golinkoff, R.M. et al. (1992) Young children and adults use lexical principles to learn new nouns Dev. Psychol. 28, 99-108

34 Clark, E.V. (1993) The Lexicon in Acquisition, Cambridge University Press

35 Woodward, A.L., Markman, E.M. and Fitzsimmons, C.M. (1994) Rapid word learning in 13- and 18-month-olds Dev. Psychol. 30, 553-566


36 Baldwin, D.A. (1996) Understanding the link between joint attention and language, in Joint Attention: its Origin and Role in Development (Moore, C. and Dunham, P., eds), Erlbaum

37 Schafer, G. and Plunkett, K. Rapid word learning by 15-month-olds under tightly controlled conditions Center Res. Lang. Newsl. (in press)

38 Mills, D.L., Coffey-Corina, S.A. and Neville, H.J. Language comprehension and cerebral specialization from 13 months to 20 months Dev. Neuropsychol. (in press)

39 Plunkett, K. et al. (1992) Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net Connection Sci. 4, 293-312

40 Marcus, G.F. et al. (1992) Overregularization in language acquisition Monogr. Soc. Res. Child Dev. 57, 1-182

41 Pinker, S. and Prince, A. (1988) On language and connectionism: analysis of a parallel distributed processing model of language acquisition Cognition 28, 73-193

42 Rumelhart, D.E. and McClelland, J.L. (1986) On learning the past tense of English verbs, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition (McClelland, J.L. and Rumelhart, D.E., eds), pp. 216-271, MIT Press

43 MacWhinney, B. and Leinbach, J. (1991) Implementations are not conceptualizations: revising the verb learning model Cognition 40, 121-157

44 Plunkett, K. and Marchman, V. (1991) U-shaped learning and frequency effects in a multi-layered perceptron: implications for child language acquisition Cognition 38, 43-102

45 Plunkett, K. and Marchman, V. (1993) From rote learning to system building: acquiring verb morphology in children and connectionist nets Cognition 48, 21-69

46 Plunkett, K. and Marchman, V. (1996) Learning from a connectionist model of the acquisition of the English past tense Cognition 61, 299-308

47 Marcus, G. (1995) Children’s overregularization of English plurals: a quantitative analysis J. Child Lang. 22, 447-459

48 Clahsen, H. et al. (1992) Regular and irregular inflection in the acquisition of German noun plurals Cognition 45, 225-255

49 Marcus, G. et al. (1995) German inflection: the exception that proves the rule Cognitive Psychol. 29, 189-256

50 Hare, M., Elman, J.L. and Daugherty, K.G. (1995) Default generalization in connectionist networks Lang. Cogn. Processes 10, 601-630

51 Plunkett, K. and Nakisa, R.C. A connectionist model of the Arabic plural system Lang. Cogn. Processes (in press)

52 Nakisa, R.C., Plunkett, K. and Hahn, U. A cross-linguistic comparison of single and dual-route models of inflectional morphology, in Models of Language Acquisition: Inductive and Deductive Approaches (Broeder, P. and Murre, J., eds), MIT Press (in press)
