Nature1997

8/17/2019 Nature1997

1/3

LETTERS TO NATURE

a 2nd formant (Hz)

2,000 1,500 1',000

Language-specific phoneme -:

representations revealed by

electric and magnetic brain

responses

Risto Naatanen*, Anne Lehtokoski*, Mietta Lennest,

Marie Cheour*, Minna Huotilainen*S, Antti livonent,

Martti Vainiot, Paavo Alku ll, Risto

J.

Ilmoniemi ,

Aavo Luuky, Juri Alliky, Janne SinkkonenaS

Kimmo Alho*

Cognitive Brain Research U nit, Depar tmc r~t t Psychologx

University o f Helsinki,

FIAT-00014,

Fillland

D e p r i r t ~ n ~ ~ r tf Phonetic-s, University of Helsinki, Finland

Biohlag Laboratory, ,i4ediral Engineeri~lgCentre, Hclsinki University Central

Hospital, Finlarrtl

4 1)cpartment of.4pplit.ii Physics, Electronics and Iniorwlation Eclrnolc~gy,

U n i ~ w s i t yf Errku, Fitilond

Acoustic5 Ltrbilratory, Helsirrk~ L 11iversityof Tcchtlology, Finland

Departrrrort qfPsychologj: University of Tartu, Estonicl

There is considerable debate about whether the early processing

of sounds depends on whether they form part of speech. Propo-

nents of such speech specificity postulate the existence of lan-

guage-dependent memory traces, which are activated in the

processing of speech -3 but not when equally complex, acoustic

non-speech stimuli are processed. Here we report the existence of

these traces in the human brain. We presented to Finnish subjects

the Finnish phoneme prototype /el as the frequent stimulus, and

other Finnish phoneme prototypes or a non-prototype (the

Estonian prototype lo /) as the infrequent stimulus. We found

that the brain s automatic change-detection response, reflected

electrically as the mismatch negativity (MMN)4-10, as enhanced

I when the infrequent, deviant stimulus was a prototype (the

Finnish 101 relative to when it was a non-prototype (the Estonian

101 . These phonemic traces, revealed by MMN, are language-

specific, as lo1 caused enhancement of MMN in Estonians. Whole-

head magnetic

recording^",'^

located the source of this native-

language, phoneme-related response enhancement, and thus the

language-specific memory traces, in the auditory cortex of the left

hemisphere.

O ur subjects were Finns and E stonians, who spe ak closely related

languages with very similar vowel structures ~'except that only

Estonians have the vowel 161, which is roughly b etween 101 (as in

'sir') an d lo/ ( as in 'sore') (Fig. la ). To determine the phonem es best

representing the vowel prototypes /el (as in 'net ' but longer in

du rat ion ), o1 and lo / ( an d also 161 for Eston ians), we asked subjects

to categorize ph oneine stimuli varying in the second-forma nt (FZ)

frequency while the F1, F3, and F4 frequencies as well as the

fundamental frequency were kept constant. Finns and Estonians

, .

~ ~ d g e dhe vowels comm on to the two languages similarly (Fig. lb ).

In a dditio n, the 101 category could also be observed in the responses

of Estonians (Fig. lb ). Finns had a gap in this position of their

response profile, su g~ es tin g hat they could perceive the stim uli

giving rise to the proto type 161 perception in Estonialis as different

from the prototype phonemes of their own language. Thus,

although perhaps forming a perceptual category, the Finns did

not consider this foreign phoneme prototypical, because they had

not experienced it in their own language.

We then presented to 13 normal-he aring Finnish subjects (aged

18-29 years, right- han ded, females and 9 males) the prototype /el

as the sta nda rd stimulus, randomly replaced by infrequent deviant

stimuli that differed from the standard stimulus only in F2 (Fig. la) .

In furth er stim ulus blocks, simple sinusoiddl tones with frequencies

1 9 1,311

I

Frequency (Hz)

el [e lo ] lo1 I81 lo1

Finns

0

I t t

t t

0

1,794 1,311

Frequency (Hz)

Figure

1

a

The arrows sh ow locat ions o f the vowel s t imu l In t i le F1-F2 spz

The standard snrnuus was the Finn ish and Est onan proto type e i ; the devl

s t imulus [e lo ] was ar , In termedia te between . e l and 101; the deviant s tm u u s

was a pro to type shared by the two languages; ihe devlant s t imulus 61 was

Intermediate betw een the Flnnlsh and Estonian prototyp es 101 and l o 1

(;

c lose l y co r respo~ ided o the Estonian prototype ;dl); the deviant stimulus

was a prototype in both languages. b The percentages of 'good phon eme' ,

:o l and 10 ' responses n F lnns (top ) and of ' good phoneme ' /e l , ' d l , Id 1

10: responses n E ston ia l~sbo t tom) n the behavo u ra task The ten s tl n -~u l :

in the judgement task are Indicated with t c k m arks. The frequency sc ales

o g a r l t h m c .

equal to these F2 frequencies were used in an otherwise identi

paradigm. Brain electric responses to these stimuli were recorc

while subjects were reading self-ch osen text un de r the instruction

ignore all auditory stimuli.

Responses to the simple tone a nd its frequency changes are sho

in Fig. 2. Th e stand ard stimuli elicited a PI-N1-P2 wavefor

while the d eviant-stimulus resvonse was dominated bv the MM

an auditory response generated by an attention-independ(

change-detection process'. As expected, the

h I M N

amp lit^

increased as a functio n of the frequ ency deviance' (Fig. 2) .

TI

the Dresent MM N to ~ h o n e m e s ccurred independently of attc

tion was supported by the 'ibsence of any subsequent sl

positivityb- (P3 or P3a; Fig. 3) .

Brain responses to ph one mes are presented in Fig. 3a. Consist,

with results indicating that

M M N

is enhanced with increas

frequency deviation' (Fig.

2 ) ,

the MM N am plitude was, in gene

larger with greater F2 deviation from the standard stimulus /el F

3a an d 4a). In striking contrast, the deviant lo (pro totyp e) elicite

larger MM N than d id the deviant 161 (non -pro totyp e), although

8/17/2019 Nature1997

2/3

letters to nature

Figure

2

Top, the electr ic response ( f rom he f rontal Fz electrode; grand-average

waveforms o f F i nnsh sub jec ts ) o the s tandard s i nuso ida tone o f 1,940Hz ( s o l d

l ine ; equa l to the F2 o f l e i used as the s tandard s t m uu s n the present phoneme

experiments ) an d to devant s~nuso ida l ones o f 1,704,1,533. 1,311 and 851 Hz

(brokei, l ine; equal to the

F2

o f t he d e v ~ a n t t ~ m u l elo]. /dl , 161, and lo:.

respec tve ly ) . The s tandard-s t imu lus response shows rea tve l y smal l P I , N1 and

D 7

- Ic f icc t ions . The negat ive displacement of the de van t-s t~m ulus espons e

rea tve to the s tandard-s timu lus response is caused by the mismatch neg atv~ t y

( M M N ) , wh i ch can be seen by subt rac tng the s tandard-s t~mulusrace f rom that

fo r the devian t st imu lus (bo ttom) . M M N s tead il y nc reases i n ampli tude w ~ t hhe

Inc reas ing ma gntu de o f f r equency change

deviated acoustically less from /el than did 161. Thus MMN was

enhance d in size for prototype deviants from that attributable to the

mere acoustic difference. In contrast to the hIMN amplitude, the

M M N latency was an orderly function of the m agnitude of acoustic

deviation, with the pro totypicality of the deviant st in ~u lu s aving

no effect (Fig. 4b).

The dev iant 161, although not a phon em e in the Finnish language,

is a phon em e in Estonian. Thus, in Estonians, the M M N amplitude

should n ot dro p fro m deviant 101 to deviant lo /. To verify this,

11

normal-he aring Estonian subjects (19-31 years old, right-handed ,

females and 5 males) were subjected to the same experimental

paradigm as the Finns. The dro p in MM N amplitude from lo1 to 161

seen in Finns did not occur in Estonians (Figs 3b and 4a), which

must be du e to Id/ being a prototype in Estonian bu t no t in Finnish.

A two-wa y ANOVA (lang uage versus deviants 101 and 161) showed a

significant interaction term

F 1.22)=

11.16,

P 0.005),

indicat-

ing that the h lM N ampli tude as a function of deviance was different

in Estonians a nd F inns. This language-specific prototypicality effect

suggests the existence of neu ral traces of language-specific ph on em e

representations.

Magnetic responses to the deviant stimuli (MMNM, the mag-

netic equivalent of the electric MM N) were recorded from 9 of the

Finnish subject^^^ ^'^. The pa ttern of responses over the left hemi-

sphere was the same (the dim inished 161 response) (Fig. 4c, d) as

that shown by the electric responses (Fig. 4a). Furthermore, the

magnetoencepllalogram showed that MhsINkl was larger in

the left than in the right hemisphere when the deviant stimulus

was a proto type (left, 34 m

;

right, 22 m -

;

ANOVA,

laterality X prototypicality interaction,

F 1.8) =

6.27,

P 0.05;

see Fig. 4d, top). The equivalent current dipole ^ of the left-

hemisphere response showed that hlMNM originated at the left

auditory cortex and, further, that its dipole moment (strength)

was considerably greater for all prototype deviants than for the

non -proto type deviant 161 (see Fig. 4c, d) . The right-hemisphe re

responses were not strong enough for LLS t o d e t e rn ~ i n e h e

corresponding equivalent current dipole for each subject. As

was the case with the electric MMN, the MMNM latency

steadily decreased with increasing acoustic deviation, there

being no prototypicality effect.

Thus a vowel prototype of the subject's native language presented

as the deviant stimulus elicited a considerably larger MMN

(M M NM ) tha n could have been expected on the basis of the

acoustic (F2) distance from the frequent, standard phoneme

alone. That is, the

M M N

amplitude was enhanced when the deviant

st imulus was a prototype as opposed to an equal ly complex n o r

prototype. F urthermore, this MM N-am pli tude enhancement pat-

tern reflected the native-language phonology: when the deviant

stimulus was a pho nem e prototype in Estonian, but no t in Finnish,

the e nhancem ellt occurred in Estonian subjects but not in Finns. tl ie

therefore suggest that, in principle, ou r MM N paradigm could help

to unravel the experience-dependent neural memory traces for

phonemes of ally given language.

We found cortical memory traces of speech sounds. These traces

probably serve as recognitioll patter ns for speech soun ds, an d

allow the correct perception of a given language: they are a ctivated

by correspo nding ( an d roughly corresponding, the category effect)

speech sounds. These recognition patterns presumably develop

gradually with exposure to the language; behavioural data suggest

they develop within the first year of life1'.

The left-hemisphere dominance of the M MN hl enhancement to

pho nem e prototypes (Fig. 4c, d ) suggests that the neural p hon em e

traces are located in the left auditory cortex. That the locus of the

mem ory t race is indeed indicated by the locus of MM N ( hIh tN M )

generation is supported by the feature-specific loci of MMN

generation in the auditory cortex1'- . In contrast to the left-hemi-

sphere results, in the right auditory cortex both prototypes an d no n-

prototypes elicited a small MhINM of a similar size that resembles

the response elicited by non-prototypes in the left auditory cortex.

Thus the magnetic results indicate that the left auditory cortex is

involved in phonemic discrimination, presumably because the

neural phoneme traces are located there, whereas both the left

and right aud itory cortices serve acoustic discrimination.

Methods

Stimuli

T h e

F2

frequency for the semi-synthetic standa rd stimulus was chosen

so that it produced the

best

exemplar representing /el (pro totyp e),as judged

bv

the subjects when the

F2

frequency was changed in

4O/o

steps. The first

i F 1 ) .

third

( F 3 ;

and four th

(F4)

fonn ants were constant

a t 450 2,540

and

3,500 Hz,

respectively.

The

average fundame ntal frequency

(FO)

of each semisynthetic

vowel was 105 Hz . \'owels were genera ted

by

the produc tion of synthetic st imuli

from n atura l glottal excitation in con junc tion with

a

vocal-tract inodel ~ . Thir

method is based

o n

a speech-production model assuming that speech

15

produced as a cascade of three independent processes: the glottal excitation,

a

Finns

Figure

3

The am p~ tu de f the mismatch r~egat l v l ty MM N, f ron ta l Fz e lec t rode,

grand-average dev iant-s tandard d~f ferencewaveforms) ref lec ts the language-

spe c f i c phonem e ca tegor ies o f the F~n n i sh nd Es ton~ananguages. N ote that

the M M N ampl tud e fo r the deva nt , d l , a non-pro to type i n F inn i sh ,

IS

clearly

sm ale r th an that for the adjacent prototype dev iants wi th F innish subjects

a )

but

not w t l i Es tonian sub jec ts b) or whom 16 w as a pro to type.

N TURE \'OL 385

3 0

JANUARY 1997

8/17/2019 Nature1997

3/3

4 inns

N

stonians

Right

MMNm

-

-* Finns

Finnish subject

RH

Figure4 a

MMN peak amplitude (a t Fz) in Finns (blue)and Estonians (purple) a s

a function of the deviant stimulus, arrang ed in the or de r of increasing F2

difference from the sta ndar d stimulus. Bars indicate s.e.m. b MMN ( at Fz) (solid

lines) and MMNM (left hem isphe re) (broken line) peak laten cies a s a function of

the deviant stimulus b r Finns (blue) and Estonians (purple).c, Strength of the

equivalent current dipole (ECD; average of

9

Finnish subject s) modelling the left

audi tory-cortex MMNM for the different devian t stimuli. d Left- and right-

the electric measurements. The magnetic responses were measured using

helmet-shaped, whole-head Neuromag-122 magnetometer (sampling rat

397Hz). The stimuli were delivered through plastic tubes and earpieces

60 dB above the individual hearing level. Epochs with an electro-oculogram o

magnetoencephalogram change exceeding 150 r 1,500

6

m - , respec

tively, were omitted from subsequent analysis. The locus of the neurona

activity was estimated by determining the equivalent current dipole at th

MM NM peak latency using 40-44 channels separately over the left and

rig

frontotemporal cortices.

Received 12 Auausc accepted 2 December 1996.

I . Kuhl, P. K. Percrpl. Pryrhophys.50,93-107 (199 1).

2. Liberman, A.

M.,

Hams, K. S., Hoffman, H. S. Griffith, B. C. I.

Erp

Plychol. 54,358-368 (195

3. Kuhl, P K.

I.

Phoner. 21, 125-139 (1993).

4. Nlltlnen, R Gaillard, A.

W

K. Mintysalo, S Acia PsychoL 42,313-329 (19 78) .

5. NUtinen, R. Ear Heor. 16.6-18 (199 5).

6. Nutinen, R. Allention and Brain Funciion (Lamence Erlbaum, Hillsdale, NJ, 1992).

7. Tiiti nen, H., May, P., Reinikainen, K. Nil tan en, R. Nar~rre 72,90-92 (1994).

8. Aaltonen,O.. Eerola.O., ~ell strii m,A .,Unispaikka,

E.

Lank A. H. I.Amusr. SOE Ant. (in the press

9. Aulankc, R., Hari, R., Lounasmaa, O., Nait lnen, R., Sa m , M. NeuroReporff 1356 -1358 (199

10. Kraus, N. eta . Science 273,971-973 (199 6).

11. Hari, R. in Auditory EvokedMagnelicFieldsand Electric Polentialr. Advances in A~tdiology ol. 6 (e

Grandori, F., Hoke, M. Romani, G. L. 222-282 (Karger, Basel, 1990).

12. Ham ainen , M., Hari,R, Ilmoniem i, R. I. Knuutila,

J.

Lounasmaa, 0 .

V.

RN Mod. Phys. 65,413

497 (1993).

13. Wiik, K. AIIII. Univ. Turku.94, 1-192 (1965).

14. limnen. A. Phonerica52.221-224 (199 5).

15.

Liiv, G. Remmel, M.

Sov.

Fenno-Ugric Studies

V l

, 7-23 (1 970).

16. Nllthen, R., Umoniemi,

R. I.

Alho, K. Trends Neurosci. 7,38 9-39 5 (199 4).

17. Nllt Hnen . R. Alho , K. Elrmoenuph . Clin. NeurophysioL (suppl.)

44

179-185 (1995).

18. Kuhl, P. K., Will iams , K. A. Lacerda, F ScienceZ55, M)6-M)8 (199 2).

19. Giard,M.

H.

el aL

1.

Cogn. Neurowi. 7, 133-143 (199 5).

20. Alho, K. et a/. Psychophysiology 33,3 69-3 75 (19 96) .

21. Levhen,

S.

Ahonen,

A

Hari, R., McEvoy, L. Sam, M.

Cercb

Cortex6 288-296 (19 96)

22. Fant, G. ntc Amtistic Theory ofspeech Producrion (Mouton, Hague, 1960).

23. Alku, P Spcrch Comrntm. 11, 109-118 (1992).

24. Knuutila,

L

E. T. er a/. l Trans. Magnet. 29, 3315-3320 (199 2).

hemisphere MMNMs of on e typical Finnish subject for deviants /o / and

/ j/

Acknowledgements.WethankG.Nyman H.Tutinen.P.Laurinen E.Se~ceandJ.Kek~nif~rc

presented n

(spacing

mcm

,) maps of the

magnetic ield-gradient

on previo us versions of this manuxript, and T. Rinnefor technical support. This work was supported t

the Academy of Finland.

amplitude at the MMNM peak latency. Th e squ are s indica te the arrangement of

the

magnetic

sensors.

The arrow

representECDs indicatingactivity

n

the

Corresponden ce and requests for materials should be addressed to R.N. (e-mail: risto.naatanen

helrinki.fi). Further material, includina the sound stimuli, are available at http:l/wmv.psych.helsin

auditory cortex; the black dots in these a rr o w sh ow the cen tres of gravity of

hl~aprrslnafure3/.

MMNM. Note that the prototype 161elic itsa much larger MMNM in the leftthan in

the right hemisphere, wher eas non-prototype 161 resp on ses in both hemi-

sp he re s are small and q uite similar in amplitud e.

the vocal tract, and th e lip-rad iatio n effect?. In this way we could accurately

adjust the formants of the stimuli without mak ing sound quality too artificial.

The excitation process was computed from a m ale voice (vowel la/, normal

pho natio n) using a n autom atic inverse filtering technique2'. The vocal tract was

modelled using an eighth-order all-pole filter with coefficients adjusted t o shift

the second formant. The lip-radiation effect was modelled with a fixed

differentiator.

Behavioural ask. Th e stimuli, presented at 1,400-ms onset-to-onset intervals,

were 75 dB in intensity and 400 ms in du ration (with rise and fall times of 10 ms

each). Each stimulus was presented about 30 times. Finnish subjects were

instructed to press the /el, 101 or lo t but ton when the stimulus sounded as a

good representative of that category. For Estonians, there was also a response

button for their 161.

Electric measurements. Measurements were performed in an acoustically

and electrically shielded room. Blocks of the vowel /el were binaurally

presented (75 dB, with a d uratio n of 400 ms includ ing 10-ms rise and fall times)

through headph ones as thesta ndard stimuli, 15 of the stimuli were randomly

appearing deviant soun ds (one type per block). The onset-to-onset inter-

stimu lus interval was 900111s. The nose-referenced electroencephalogram

(EEG, samp ling rate 250 Hz ) was averaged and digitally filtered (passband 1-

30H z). Epochs with artefacts exceeding 100 pV at any electrode o r with a n

electro-oculogram (EOG) response exceeding 15 0p V were discarded. The

baseline for the waveforms was defined as the mean am plitude between - 0

and 0 ms relative to stimulus onset. The MMN am plitude was determined from

the Fz electrode (where it was usually largest) separately for each subject as the

mean am plitude of the 150-ms period centred at the largest peak between 100

and 240 ms.

Magnetlc measurements. The experimental paradigm was the same as for

VOL

Nature1997

Documents

Transcript of Nature1997