
Neurobiological Evidence for Abstract Phonological Representations in the Mental Lexicon during Speech Recognition

Carsten Eulitz and Aditi Lahiri

Abstract

A central issue in speech recognition is how contrastive phonemic information is stored in the mental lexicon. The conventional view assumes that this information is closely related to acoustic properties of speech. Considering that no word is ever pronounced alike twice and that the brain has limited capacities to manage information, an opposing view proposes abstract underspecified representations where not all phonemic features are stored. We examined this proposal using event-related brain potentials, in particular mismatch negativity (MMN), an automatic change detection response in the brain that is sensitive to language-specific phoneme representations. In the current study, vowel pairs were presented to subjects, reversed as standard and deviant. Models not assuming underspecification predict equal MMNs for vowel pairs regardless of the reversal. In contrast, enhanced and earlier MMNs were observed for those conditions where the standard is not phonologically underspecified in the mental representation. This provides the first neurobiological evidence for a featurally underspecified mental lexicon.

INTRODUCTION

The mental lexicon is a part of the declarative memory containing all information necessary for efficient recognition of speech in humans. A vital issue in linguistics and brain research is the nature of the mental representation involved in the processing of sounds of natural language. From a linguistic point of view, not all features that can be extracted from the acoustic signal are required for recognizing speech sounds. Consequently, all predictable and nondistinctive information can be excluded from the mental lexicon (Kiparsky, 1993; Lahiri & Marslen-Wilson, 1991; Lahiri & Reetz, 2002; Keating, 1988). The underspecified approach, spelled out in the featurally underspecified lexicon (FUL) model (Lahiri & Reetz, 2002), serves to explain how the perceptual system handles acoustic variance of single words across speakers and contexts beyond the possibility of merely storing all variance (Bybee, 2001). A word like “ten” can be pronounced as “tem” in “te[m] bags” or “teng” in “te[ŋ] gates” (Fitzpatrick & Wheeldon, 2002). FUL assumes just a single representation for all the variants, namely “ten,” with an /n/ that is not specified for its place of articulation, which is [CORONAL]. This underspecified /n/ does not conflict with the contextual variants [n], [m], and [ŋ]. Such an equivocal representation is, however, not the case for all sounds. Whereas words ending with /n/ have contextual variants, words ending with /m/ like “cream” (place of articulation [LABIAL]) will not become “crea[ŋ]” in “crea[ŋ] car” or “crea[n]” in “crea[n] dress.” FUL accounts for the asymmetry by fully specifying the place of articulation for /m/ in “cream,” which conflicts with other places of articulation. A similar point can be made for [CORONAL] underspecification of vowels, particularly in vowel harmony languages like Finnish (van der Hulst & van de Weijer, 1995). An unresolved question is whether the human brain indeed uses not only specified but also underspecified mental representations during speech recognition.

An ideal methodology for this purpose is mismatch negativity (MMN), an automatic change detection response in the brain, which has been shown to be an index of experience-dependent memory traces and to be sensitive to language-specific phoneme representations (Phillips et al., 2000; Winkler et al., 1999; Näätänen et al., 1997; Näätänen, 2001) and representations for words (Shtyrov & Pulvermüller, 2002; Pulvermüller et al., 2001). MMN is elicited by infrequent, deviant stimuli presented after a random number of frequent, standard stimuli. The standard stimuli create a central sound representation that is more abstract than the sum of perceived acoustic elements and corresponds to the information content of the sound perception, the sensory memory, and the long-term memory (Näätänen, 2001; Cowan, 1999). That means that the central sound representation corresponds in part to the long-term memory traces and may thus convey information about the phonological representation in the mental lexicon, which is, in linguistic terms, the underlying representation. The percept created by the infrequent, deviant stimulus is more low level and has vowel-specific information available around 100 msec after stimulus onset (Obleser & Eulitz, 2002; Obleser, Elbert, Lahiri, & Eulitz, 2003; Poeppel et al., 1996; Eulitz, Diesch, Pantev, Hampson, & Elbert, 1995). This percept corresponds in part to the set of phonological features extracted from the acoustic signal, the so-called surface form. In this way the MMN can be used to study the difference between the surface form, extracted from the deviant, and the underlying representation, created by the standard.

University of Konstanz
© 2004 Massachusetts Institute of Technology. Journal of Cognitive Neuroscience 16:4, pp. 577–583

Our standard and deviant stimuli were three German vowels [e], [ø], and [o]. The acoustic differences between [e] and [ø] were similar to the difference between [ø] and [o]. The relevant phonological features (cf. Ghini, 2001; Halle, 1992; Lahiri & Evers, 1991) in the surface form are given in Figure 1. According to FUL, not all of these features are specified in the mental representation. The surface forms and mental representations are thus asymmetric for [CORONAL] vowels but symmetric for [DORSAL] vowels. This difference in symmetry and the fact that the place features [CORONAL] and [DORSAL] are mutually exclusive for vowels in any natural language (Lahiri & Evers, 1991) were exploited in the following way: Where features from the surface form and the mental representation are identical or not mutually exclusive, there is no conflict between the surface form and the representation. On the other hand, where they are mutually exclusive, for example, [CORONAL] and [DORSAL], the two features are in conflict. Depending on the vowels chosen as deviant and standard stimuli in an oddball paradigm, the mapping of the surface features to the representation results in a conflict or nonconflict situation. The results of the mapping process are thus asymmetric as well and give clear predictions regarding the expected MMN.

Our hypothesis is that the MMN would be of a higher magnitude and show an earlier peak latency if the mapping involves a conflict rather than a nonconflict situation (Näätänen & Alho, 1997). The critical comparison is between /o/ and /ø/: A conflict occurs when /o/ is the standard and /ø/ the deviant, but not vice versa. The conflict occurs when the feature [CORONAL] extracted from the deviant stimulus [ø] maps onto the feature [DORSAL] in the mental representation created by the standard /o/, which will be referred to as [ø]/o/. In contrast, there is no such conflict with the inverse mapping, when the feature [DORSAL] from the deviant [o] maps onto the underspecified representation of the standard /ø/, referred to as [o]/ø/ (see also lower part of Figure 1). In a parallel argument, we do not predict differential MMN when we consider the vowels /ø/ and /e/. The acoustic difference between these two vowels is about the same as between /ø/ and /o/. Nevertheless, there is no conflict in this pair of inversion, that is, in [ø]/e/ and [e]/ø/. That is, for both vowels used as deviants, the feature [CORONAL] is extracted and does not conflict with the respective mental representations created by the standards. Hence, no difference in MMN is expected in this pair of inversion.
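The mapping logic behind these predictions can be written out as a small sketch. The feature sets below are assumptions made for illustration: the [CORONAL] and [DORSAL] entries follow the text, whereas the [LABIAL] entries for the rounded vowels and the names `SURFACE`, `UNDERLYING`, and `conflicts` are hypothetical and are not taken from Figure 1. Height features are omitted.

```python
# Assumed place features for the three vowels (illustrative only).
# Surface form: features extracted from the incoming deviant.
SURFACE = {
    "e": {"CORONAL"},
    "ø": {"CORONAL", "LABIAL"},
    "o": {"DORSAL", "LABIAL"},
}
# Underlying form: features specified in the representation of the standard.
# [CORONAL] is left unspecified, as FUL assumes.
UNDERLYING = {
    "e": set(),
    "ø": {"LABIAL"},
    "o": {"DORSAL", "LABIAL"},
}
# Place features that cannot co-occur in a vowel.
MUTUALLY_EXCLUSIVE = {frozenset({"CORONAL", "DORSAL"})}


def conflicts(deviant, standard):
    """Return True if any surface feature of the deviant is mutually
    exclusive with a feature specified for the standard."""
    return any(
        frozenset({s, u}) in MUTUALLY_EXCLUSIVE
        for s in SURFACE[deviant]
        for u in UNDERLYING[standard]
    )


# Print the predicted status of all six deviant/standard combinations.
for deviant, standard in [("ø", "o"), ("o", "ø"), ("ø", "e"),
                          ("e", "ø"), ("e", "o"), ("o", "e")]:
    status = "conflict" if conflicts(deviant, standard) else "no conflict"
    print(f"[{deviant}]/{standard}/: {status}")
```

Under these assumed feature sets, only the mappings of a [CORONAL] deviant onto the /o/ standard come out as conflicting, which is the asymmetry the hypothesis relies on.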

In contrast, the conventional models assume that the underlying representation of speech sounds corresponds directly to their acoustic signal properties (Klatt, 1989) or to closely related surface forms (Steriade, 2000; Ohala & Ohala, 1995). In that case, words are stored with all phonetic and redundant details, including variant pronunciations, such that all mental representations are fully specified (Bybee, 2001). These models make distinct predictions from FUL. For them, both inversion pairs of standard and deviant stimuli would have exactly analogous acoustic differences as well as levels of conflict, and no MMN difference between [ø]/o/ and [o]/ø/ or between [ø]/e/ and [e]/ø/ would be expected. The distinct predictions of the two opposing views on mental representation were tested by neurobiological means.

Figure 1. Acoustic and phonological characteristics of the natural German vowels used. The upper part shows the locations of all vowels in the F2–F3 space. The lower part lists the corresponding sets of phonological features (except for the common feature [VOCALIC]) extracted from the incoming stimulus, that is, the surface form, and those in the underlying mental representation, assuming a featurally underspecified lexicon. The arrows illustrate the main statistical model. Green arrows indicate combinations of standard and deviant stimuli in nonconflicting conditions and the orange arrow the conflicting condition.

RESULTS

As shown in Figure 2, there is a clear MMN component in the grand average difference waveforms in all experimental conditions. Topographical information about the MMN is displayed in the lower part of Figure 2. Both the amplitude maps as well as the current source density (CSD) maps show typical MMN topographies (Näätänen, 2001) with a predominant influence of left and right hemispheric temporal generators on the MMN. Based on the maps, the prominent differences between experimental conditions were found in amplitude and not in topography. Consequently, we restricted further analyses to the difference between Fz and linked mastoids as a transparent parameter for the integrated brain activity related to the ongoing change detection processes. The MMN components in the upper part of Figure 2 show a larger and earlier peak in the conditions where there is a conflict between surface form and underlying representation. This difference predicted by the FUL model was paralleled by similar MMNs in the pair of inversion with the two nonconflicting conditions. This difference pattern was evident for the frontal electrode position as well as for the root mean square (rms). Mean latencies and amplitudes of the MMN are summarized in Table 1. Two-way ANOVA revealed statistically significant interactions for the MMN latency, F(1/11) = 10.53; p < .01, the MMN amplitude at the frontal electrode position, F(1/11) = 5.21; p < .05, as well as the rms amplitude, F(1/11) = 8.17; p < .05. Post hoc comparisons for the two pairs of inversion showed significant differences for the conflict versus nonconflict pair (all parameters p < .001) and no significance for any of the parameters in the other pair of inversion (all p values > .47). Post hoc tests comparing the conflicting pair to the three nonconflicting conditions (A: −3, 1, 1, 1) revealed significant differences as well [latency: F(1/11) = 36.02; Fz amplitude: F(1/11) = 19.38; rms amplitude: F(1/11) = 21.00; all p values < .005].

Figure 2. MMN waveforms (upper part) and the corresponding MMN topographies (lower part) for all vowel pairs reversed as standard and deviant are shown. Data corresponding to the main statistical model are presented in the first two columns. The third pair of inversion, with a considerably larger acoustic difference, is illustrated in the third column. The upper row shows the waveforms at the frontal electrode position (Fz) and the next row, the rms waveforms. The color of the waveforms indicates the deviant vowel, red for [e], blue for [ø], and black for [o]. Solid lines indicate the conflicting conditions and dotted lines, the nonconflicting conditions. The superimposed rectangle illustrates the time window for calculating the MMN amplitudes. The lower part of the figure shows the MMN topographies for the corresponding experimental conditions. The first row displays the spline interpolated amplitude maps (average reference) with a contour step of 0.25 µV. Blue lines indicate negative potentials, red lines positive potentials. The second line shows the corresponding current source density (CSD) maps using a contour step of 0.025 µV/cm².

Using a one-way ANOVA, the third pair of inversion ([o]/e/ vs. [e]/o/) revealed main effects for the MMN latency, F(1/11) = 24.02; p < .001, and the rms MMN amplitude, F(1/11) = 5.90; p < .05, but no significance for the amplitude difference at the frontal electrode Fz.
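The size of the latency asymmetry can be read off the group means in Table 1 with simple arithmetic. The sketch below is descriptive only and does not reproduce the ANOVA, which requires single-subject data; the names `latency`, `reversal_difference`, and `weights` are hypothetical, and reading the values (−3, 1, 1, 1) as contrast weights on the four conditions of the main model is an assumption.

```python
# Mean peak latencies (msec) copied from Table 1; keys are (deviant, standard).
latency = {
    ("e", "ø"): 157.5, ("ø", "e"): 158.2,
    ("ø", "o"): 148.2, ("o", "ø"): 164.0,
    ("e", "o"): 149.2, ("o", "e"): 167.0,
}


def reversal_difference(a, b):
    """Latency of [a]/b/ minus latency of [b]/a/ for one pair of inversion."""
    return round(latency[(a, b)] - latency[(b, a)], 1)


print(reversal_difference("ø", "e"))  # both conditions nonconflicting
print(reversal_difference("o", "ø"))  # nonconflicting minus conflicting
print(reversal_difference("o", "e"))  # third pair of inversion

# Assumed contrast: the conflicting condition weighted -3 against the
# three nonconflicting conditions of the main model, each weighted 1.
weights = {("ø", "o"): -3, ("o", "ø"): 1, ("ø", "e"): 1, ("e", "ø"): 1}
contrast = round(sum(w * latency[cond] for cond, w in weights.items()), 1)
print(contrast)
```

On the group means, reversing the nonconflicting pair shifts the peak by 0.7 msec, whereas the two pairs containing a conflicting condition differ by 15.8 and 17.8 msec; the weighted contrast on the latencies equals 35.1 msec.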

DISCUSSION

Our results clearly support the predictions of the FUL model, which claims that the same vowel pair may be conflicting or not conflicting depending on which phonological features are specified in the underlying representation. Thus, if we take the [o]/ø/ versus [ø]/o/ pair, when /ø/ is the standard, and thus taps the underlying representation that is not specified for its place feature [CORONAL], there is no conflict with the deviant [o]’s surface representation [DORSAL]. However, if [ø] is the deviant, the surface representation does contain [CORONAL] and conflicts with the underlying representation of /o/, which, when /o/ is the standard, contains the specified feature [DORSAL]. According to these assumptions, the same acoustic contrasts trigger differential MMNs when they are reversed as standard and deviant in conflict situations, but not when both conditions of the pair of inversion are nonconflicting. In the nonconflict cases, [ø]/e/ and [e]/ø/, the acoustic difference between standard and deviant is the same, and, correspondingly, the MMN does not vary. However, in the asymmetric cases [ø]/o/ versus [o]/ø/, the MMN is significantly earlier and of larger amplitude when there is a conflict ([ø]/o/) compared to when there is no conflict ([o]/ø/). This MMN change in a conflict/nonconflict pair was corroborated by the third pair of inversion when using the latency and the rms amplitude. The reduced amplitude effect for the frontal electrode position can be explained by ceiling effects. Obviously one of the two parallel processes contributing to the MMN (Näätänen, 2001), the acoustic change detection, was too dominant relative to the phoneme-specific processes.

This illustrates potential methodological difficulties in running these kinds of experiments. There are multiple stimulus characteristics, such as frequency, intensity, or timing (for reviews, see Näätänen, 2001; Picton, Alain, Otten, Ritter, & Achim, 2000), influencing the latency and the amplitude of the MMN in parallel. Moreover, when using spectrally complex stimuli it is almost impossible to control for all possible factors of influence at the same time in a perfect way. Therefore, we used a number of methodological solutions going beyond the standard MMN designs. First, we introduced acoustic variability within the vowel categories for both the standard and the deviant. This was done not only to simulate more natural speech perception conditions and to force the processing system to map the incoming acoustic signals onto more abstract representations, but also to prevent the change detection reflected in the MMN from being based on just one or a few particular acoustic features, which may even be unimportant for verbal processing. Second, critical comparisons were made between pairs of inversion, for example, [ø]/e/ and [e]/ø/. This guarantees that the acoustic contrast between standard and deviant is equal and just the directionality is different. To our knowledge there is no study showing that the MMN is sensitive to the directionality of change in formant frequencies. The approach with the inversion of the role of standard and deviant worked out in the present study where the stimuli varied mainly in the F2 dimension, which is the most critical acoustic parameter when varying the place of articulation. But, given the amount of nonlinearities in the auditory system and the acoustic complexity of the speech sounds, it may well be that other contrasts that are not confined to an acoustic difference in the F2 dimension (e.g., vowel height) may not show symmetric MMNs after inversion even when both conditions are conflicting or both are nonconflicting. This possibility is, however, hypothetical in nature and has to be investigated in further studies. Third, difference waveforms are based on within category differences (see Methods). This is important because event-related brain responses to different vowels show different waveform morphologies that may have nothing to do with any change detection processes. This is at least known for the N100 component (Obleser & Eulitz, 2002; Roberts, Ferrari, Stufflebeam, & Poeppel, 2000), which can be close to the MMN in latency and shows different latencies and amplitudes for different vowels. Consequently, when calculating the within-block difference waveforms (as done in most of the MMN studies, but not here) the difference between standard and deviant waveforms beyond the change detection will be superimposed in the difference waveforms. Last but not least, the present study shows that the different pairs of inversion for comparison should have acoustic contrasts as similar as possible (which were given for two pairs of inversion in the present study). This seems to be important to avoid ceiling effects that may lead to false negative amplitude results. In sum, we recommend that this set of special methodological solutions should be seriously considered to enable us to come to reasonable conclusions based on MMN differences when studying the processing of speech sounds or other acoustically complex stimuli.

Table 1. Mean Peak Latencies and Amplitudes of the MMN for All Experimental Conditions

Experimental Condition    Peak Latency ± SEM (msec)    Mean Fz Amplitude ± SEM (µV)    Mean rms Amplitude ± SEM (µV)
[e]/ø/                    157.5 ± 3.3                  −1.83 ± 0.30                    0.75 ± 0.09
[ø]/e/                    158.2 ± 2.4                  −1.95 ± 0.27                    0.83 ± 0.11
[ø]/o/                    148.2 ± 2.7                  −2.80 ± 0.31                    1.17 ± 0.13
[o]/ø/                    164.0 ± 2.4                  −1.77 ± 0.18                    0.72 ± 0.07
[e]/o/                    149.2 ± 3.9                  −2.94 ± 0.21                    1.22 ± 0.09
[o]/e/                    167.0 ± 3.6                  −2.66 ± 0.37                    1.01 ± 0.14

The emphasis in this study has been on place features for vowels. The results indicate that the mental representations of phonological place features of vowels are not isomorphic with the acoustic information available to the listener, in the sense that their dimensions cannot be transformed into a space that exactly maps the acoustics of speech signals. What is represented is determined by both universal linguistic principles of underspecification and language-specific contrasts. This implies that the mental representation is more abstract and sparse than assumed by theories suggesting the storage of all acoustic information (Bybee, 2001). It may well be that on occasion listeners do store more information in the long-term memory than demanded by the phonological system, but it is equally obvious that an abstract representation (or at least a set of abstract features in the central sound representation) is the only way to account for the present results. More neurobiological experiments on different phonological dimensions across languages are naturally necessary to better understand the functional architecture of the mental lexicon.

In sum, these data show that in addition to mere acoustic changes, what influences MMN is the conflict between phonological features extracted from the deviant vowel and those specified in the mental representation created by the standard. This can be interpreted as neurobiological evidence that the human brain uses phonologically underspecified mental representations during vowel perception. Thus, the asymmetry predicted by the phonologically underspecified mental representation of the FUL model is borne out by the neurobiological evidence.

METHODS

Stimulus Description

The standard and deviant stimuli were three different tokens of the three German vowels [e] (as in bay), [ø] (as in Goethe) and [o] (as in go), spoken by a male speaker. Acoustically, the vowels differed in F2 and F3, the tongue height, corresponding to F1, having been kept almost constant. The F0 (109–111 Hz) and the F1 (301–316 Hz) were close to each other for all vowel categories. As shown in the upper part of Figure 1, the [ø] and [e] had smaller F2 but larger F3 differences than [ø] and [o], resulting in relatively close overall acoustic differences in both pairs of stimuli. Stimuli of 200-msec duration (50-msec onset and offset ramps) were presented every 700 msec with a fixed intertrial interval binaurally via headphones. Stimuli were subjectively rated as equally loud and had equalized sound energy (root mean square with equally weighted frequencies). Measured sound pressure levels (linear scale) were 51 dB for the [ø] and [o], and 51.5 dB for the [e]. By using three tokens of each vowel, acoustic variability was introduced to simulate more natural speech perception conditions and thereby force the processing system to map the incoming acoustic signals onto more abstract representations.
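The ramping and energy equalization described above might be sketched as follows. Several details are assumptions that the text does not report: the raised-cosine ramp shape, the 16 kHz sample rate, the target rms value, and the time-domain rms calculation. The sine wave is only a placeholder for a recorded vowel token, and all function names are hypothetical.

```python
import math

SAMPLE_RATE = 16000  # Hz; assumed, not reported in the text


def apply_ramps(samples, sample_rate, ramp_ms=50.0):
    """Scale the first and last ramp_ms of a signal by a raised-cosine ramp
    (the ramp shape is an assumption)."""
    n_ramp = int(round(sample_rate * ramp_ms / 1000.0))
    if 2 * n_ramp > len(samples):
        raise ValueError("signal is shorter than the two ramps")
    out = list(samples)
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))  # rises from 0
        out[i] *= gain        # onset ramp
        out[-1 - i] *= gain   # offset ramp, mirrored
    return out


def rms(samples):
    """Root mean square of a signal in the time domain."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def equalize_rms(samples, target_rms):
    """Rescale a signal so that its rms equals target_rms."""
    scale = target_rms / rms(samples)
    return [x * scale for x in samples]


# Placeholder 200-msec signal (300 Hz sine) standing in for a vowel token.
n_samples = SAMPLE_RATE * 200 // 1000
token = [math.sin(2 * math.pi * 300 * i / SAMPLE_RATE) for i in range(n_samples)]
stimulus = equalize_rms(apply_ramps(token, SAMPLE_RATE), target_rms=0.1)
```

Applying the same `target_rms` to every token would give all stimuli the same overall energy, which is one plausible reading of the equalization step.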

Subjects and Task

Twelve right-handed undergraduate students of psychology (50% women; aged 21–30 years) underwent an electroencephalographic (EEG) study where two vowel categories were presented as standard and deviant stimuli in a passive oddball paradigm. The EEG was recorded from 65 electrode positions (Electrocap, Germany) against Cz as a reference between 0.1 and 30 Hz using a sampling rate of 500 Hz. The electrooculogram was coregistered and used to correct the EEG raw data for eye movements. During the recordings subjects were reading a self-chosen book and were asked to be quiet and to avoid excessive eye movements. During the study, 850 standards and 150 deviants per vowel category and block were presented. Standards immediately after the deviant were excluded from the averages. Epochs with artefacts exceeding 100 µV or containing artefacts found by visual inspection were discarded.

In each experimental session the three vowel categories were combined in all possible pairs, with each vowel serving as a standard as well as a deviant, resulting in six blocks. The order of blocks was counterbalanced across subjects.
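The block structure and the exclusion of standards that follow a deviant can be illustrated with a short sketch. The randomization scheme is an assumption: the text states only that deviants follow a random number of standards, so the sketch requires at least one standard before every deviant and treats the counts of 850 and 150 as holding for each block. The names `oddball_sequence` and `usable_standards` are hypothetical.

```python
import itertools
import random


def oddball_sequence(n_standards=850, n_deviants=150, seed=0):
    """Return a list of 'S'/'D' labels in which every deviant is preceded
    by at least one standard (an assumed constraint)."""
    if n_standards < n_deviants:
        raise ValueError("need at least one standard per deviant")
    rng = random.Random(seed)
    # One gap of standards before each deviant, plus a trailing gap.
    gaps = [1] * n_deviants + [0]
    for _ in range(n_standards - n_deviants):
        gaps[rng.randrange(len(gaps))] += 1
    sequence = []
    for gap in gaps[:-1]:
        sequence.extend(["S"] * gap + ["D"])
    sequence.extend(["S"] * gaps[-1])
    return sequence


def usable_standards(sequence):
    """Indices of standards that do not immediately follow a deviant."""
    return [
        i for i, label in enumerate(sequence)
        if label == "S" and (i == 0 or sequence[i - 1] != "D")
    ]


# All ordered (standard, deviant) pairs of the three vowels: six blocks.
blocks = list(itertools.permutations("eøo", 2))
sequence = oddball_sequence()
kept = usable_standards(sequence)
print(len(blocks), sequence.count("S"), sequence.count("D"), len(kept))
```

With this scheme roughly 700 of the 850 standards per block remain for averaging, since each deviant removes the standard that follows it.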


Data Analysis

The averaged standard waveforms of the same vowel in different blocks were statistically not different and therefore grand averaged across two corresponding blocks before being used for further analyses. These standards were the basis to calculate the difference waveforms that are used to extract the MMN. As there are slight changes in topography and timing of the components of the event-related potentials between vowels, we calculated the within-vowel-category differences. For instance, the MMN waveform to [e]-deviant measured in a block where /ø/ was the standard was calculated as [e]-deviant minus /e/-standard and will be reported as [e]/ø/.
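A minimal sketch of the within-vowel-category subtraction is given below. It assumes that each averaged waveform is a list of amplitudes on a common time axis; the toy numbers and the names `mean_waveform` and `within_category_difference` are hypothetical.

```python
def mean_waveform(waveforms):
    """Point-by-point mean of equally long waveforms."""
    return [sum(values) / len(values) for values in zip(*waveforms)]


def within_category_difference(deviant_avg, standard_avgs_same_vowel):
    """Deviant average minus the mean standard average of the SAME vowel,
    where the standards come from the two blocks in which that vowel
    served as the standard."""
    standard = mean_waveform(standard_avgs_same_vowel)
    return [d - s for d, s in zip(deviant_avg, standard)]


# Toy data: [e] as deviant in the block where /ø/ was the standard,
# and /e/ as standard in the two blocks with [ø] and [o] as deviants.
e_deviant_in_oe_block = [1.0, -2.0, 0.5]
e_standard_blocks = [[1.0, 0.0, 0.5], [0.0, 0.0, 0.5]]
mmn_e_on_oe = within_category_difference(e_deviant_in_oe_block, e_standard_blocks)
print(mmn_e_on_oe)
```

Because deviant and standard responses stem from the same vowel, waveform differences between vowels cancel in the subtraction and what remains should reflect the change detection.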

From the difference waveforms we derived the dependent variables. These were (i) the MMN latency measured at the maximum amplitude of the MMN at the Fz electrode position (rereferenced against linked mastoids) in the latency range from 90 to 210 msec poststimulus onset and (ii) the MMN amplitude at Fz position (rereferenced against linked mastoids) measured as the mean amplitude across 80 msec centered at the mean MMN latency across subjects in the corresponding experimental condition (for latencies, see Table 1). As a more general amplitude measure, we used (iii) the rms amplitude across all electrodes, which was calculated for the same adjusted latency windows. These parameters were subjected to a two-way repeated measures ANOVA with the factors pair of inversion showing an equalized acoustic change (called later on pair of inversion), that is, [ø]/e/ versus [e]/ø/ and [ø]/o/ versus [o]/ø/, and direction of change of the F2 frequency between deviant and standard (ascending: [e]/ø/ and [ø]/o/ vs. descending: [ø]/e/ and [o]/ø/). Planned comparisons were used for post hoc testing. The statistical model was restricted to the two pairs of inversion with an almost equal acoustic change. The third pair of inversion ([o]/e/ vs. [e]/o/) parallels the prediction for the conflict/nonconflict pair but shows a markedly larger acoustic difference between standard and deviant and was therefore statistically tested separately.
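The three dependent variables can be sketched on a synthetic difference waveform. Two readings are assumptions: the "maximum amplitude" of the MMN is taken as the most negative Fz value in the search window, and the rms amplitude is computed across electrodes at each time point and then averaged over the window. The Gaussian test waveform, the second channel, and all function names are hypothetical.

```python
import math


def peak_latency(times_ms, fz, window=(90.0, 210.0)):
    """Latency of the most negative Fz value inside the search window."""
    candidates = [(v, t) for t, v in zip(times_ms, fz)
                  if window[0] <= t <= window[1]]
    return min(candidates)[1]


def mean_amplitude(times_ms, fz, center_ms, width_ms=80.0):
    """Mean Fz amplitude in a window of width_ms centered at center_ms."""
    half = width_ms / 2.0
    values = [v for t, v in zip(times_ms, fz)
              if center_ms - half <= t <= center_ms + half]
    return sum(values) / len(values)


def rms_amplitude(times_ms, channels, center_ms, width_ms=80.0):
    """Mean over the window of the rms across electrodes per time point."""
    half = width_ms / 2.0
    values = []
    for i, t in enumerate(times_ms):
        if center_ms - half <= t <= center_ms + half:
            column = [channel[i] for channel in channels]
            values.append(math.sqrt(sum(v * v for v in column) / len(column)))
    return sum(values) / len(values)


# Synthetic difference waveform sampled every 2 msec (500 Hz):
# a negative deflection peaking at 150 msec.
times = [2.0 * i for i in range(151)]
fz = [-2.0 * math.exp(-((t - 150.0) / 30.0) ** 2) for t in times]
channels = [fz, [0.5 * v for v in fz]]  # two toy electrodes

latency = peak_latency(times, fz)
amplitude = mean_amplitude(times, fz, center_ms=latency)
rms_value = rms_amplitude(times, channels, center_ms=latency)
print(latency, amplitude, rms_value)
```

In the study the amplitude window is centered at the mean latency across subjects for each condition, so `center_ms` would be the condition mean from Table 1 rather than the single-waveform peak used in this toy call.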

Acknowledgments

The research was supported by the Deutsche Forschungsgemeinschaft including the Leibniz funds. We thank B. Awiszus and A. Kupka for technical assistance, and T. Elbert, J. Fitzpatrick, A. Keil, J. Obleser, F. Plank, H. Reetz, B. Rockstroh, and L. Wheeldon for comments and discussions.

Reprint requests should be sent to Carsten Eulitz, Departments of Psychology and Linguistics, University of Konstanz, PO Box D25, 78457 Konstanz, Germany, or via e-mail: [email protected].

REFERENCES

Bybee, J. (2001). Phonology and language use. Cambridge:Cambridge University Press.

Cowan, N. (1999). An embedded-processes model of working

memory. In A. Miyake & P. Shah (Eds.), Models of workingmemory: mechanisms of active maintenance andexecutive control. Cambridge: Cambridge University Press.

Eulitz, C., Diesch, E., Pantev, C., Hampson, S., & Elbert, T.(1995). Magnetic and electric brain activity evoked by theprocessing of tone and vowel stimuli. Journalof Neuroscience, 15, 2748–2755.

Fitzpatrick, J., & Wheeldon, L. R. (2002). Phonology andphonetics in psycholinguistic models of speechperception. In N. Buton-Roberts, P. Carr, & G. Docherty(Eds.), Phonological knowledge (pp. 131–159). Oxford:Oxford University Press.

Ghini, M. (2001). Asymmetries in the phonology of Miogliola.Berlin: Mouton.

Halle, M. (1992). Phonological features. In W. Bright (Ed.),International encyclopedia of linguistics (Vol. 3,pp. 207–212). Oxford: Oxford University Press.

Keating, P. A. (1988). Underspecification in phonetics.Phonology, 5, 275–292.

Kiparsky, P. (1993). Blocking in nonderived environments. InS. Hargus & E. M. Kaisse (Eds.), Studies in lexical phonology(Vol. 4, pp. 277–314). New York: Academic Press.

Klatt, D. H. (1989). Review of selected models of speechperception. In W. D. Marslen-Wilson (Ed.), Lexicalrepresentation and process (pp. 162–226).Cambridge: MIT Press.

Lahiri, A., & Evers, V. (1991). Palatalization and coronality. In C. Paradis & J.-F. Prunet (Eds.), The special status of coronals: Phonetics and phonology 2 (pp. 79–100). San Diego: Academic Press.

Lahiri, A., & Marslen-Wilson, W. D. (1991). The mental representation of lexical form: A phonological approach to the recognition lexicon. Cognition, 38, 245–291.

Lahiri, A., & Reetz, H. (2002). Underspecified recognition. In C. Gussenhoven & N. Warner (Eds.), Laboratory phonology 7 (pp. 637–676). Berlin: Mouton.

Naatanen, R. (2001). The perception of speech sounds by the human brain as reflected by the mismatch negativity (MMN) and its magnetic equivalent (MMNm). Psychophysiology, 38, 1–21.

Naatanen, R., & Alho, K. (1997). Mismatch negativity (MMN): The measure for central sound representation accuracy. Audiology and Neuro-otology, 2, 341–353.

Naatanen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., Vainio, M., Alku, P., Ilmoniemi, R. J., Luuk, A., Allik, J., Sinkkonen, J., & Alho, K. (1997). Language specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385, 432–434.

Obleser, J., Elbert, T., Lahiri, A., & Eulitz, C. (2003). Cortical representation of vowels reflects acoustic dissimilarity determined by formant frequencies. Cognitive Brain Research, 15, 207–213.

Obleser, J., & Eulitz, C. (2002). Extraction of phonological features from spoken vowels is mirrored by the MEG response. Paper presented at the 13th International Conference on Biomagnetism, Jena.

Ohala, J., & Ohala, M. (1995). Speech perception and lexical representation: The role of vowel nasalization in Hindi and English. In B. Connell & A. Arvaniti (Eds.), Papers in laboratory phonology IV: Phonology and phonetic evidence (pp. 41–60). Cambridge: Cambridge University Press.

Phillips, C., Pellathy, T., Marantz, A., Yellin, E., Wexler, K., Poeppel, D., McGinnis, M., & Roberts, T. (2000). Auditory cortex accesses phonological categories: An MEG mismatch study. Journal of Cognitive Neuroscience, 12, 1038–1055.

582 Journal of Cognitive Neuroscience Volume 16, Number 4

Picton, T. W., Alain, C., Otten, L., Ritter, W., & Achim, A. (2000). Mismatch negativity: Different water in the same river. Audiology and Neuro-otology, 5, 111–139.

Poeppel, D., Yellin, E., Phillips, C., Roberts, T. P., Rowley, H. A., Wexler, K., & Marantz, A. (1996). Task-induced asymmetry of the auditory evoked M100 neuromagnetic field elicited by speech sounds. Cognitive Brain Research, 4, 231–242.

Pulvermuller, F., Kujala, T., Shtyrov, Y., Simola, J., Tiitinen, H., Alku, P., Alho, K., Martinkauppi, S., Ilmoniemi, R. J., & Naatanen, R. (2001). Memory traces for words as revealed by the mismatch negativity. Neuroimage, 14, 607–616.

Roberts, T. P., Ferrari, P., Stufflebeam, S. M., & Poeppel, D. (2000). Latency of the auditory evoked neuromagnetic field components: Stimulus dependence and insights toward perception. Journal of Clinical Neurophysiology, 17, 114–129.

Shtyrov, Y., & Pulvermuller, F. (2002). Neurophysiological evidence of memory traces for words in the human brain. NeuroReport, 13, 521–525.

Steriade, D. (2000). Paradigm uniformity and the phonetics–phonology boundary. In M. Broe & J. Pierrehumbert (Eds.), Papers in laboratory phonology V: Acquisition and the lexicon (pp. 313–334). Cambridge: Cambridge University Press.

van der Hulst, H., & van de Weijer, J. (1995). Vowel harmony. In J. A. Goldsmith (Ed.), The handbook of phonological theory (pp. 495–534). Oxford: Blackwell.

Winkler, I., Lehtokoski, A., Alku, P., Vainio, M., Czigler, I., Csepe, V., Aaltonen, O., Raimo, I., Alho, K., Lang, H., Iivonen, A., & Naatanen, R. (1999). Pre-attentive detection of vowel contrasts utilizes both phonetic and auditory memory representations. Cognitive Brain Research, 7, 357–369.
