Current Biology, Volume 23

Supplemental Information

Neural Response Phase Tracks How Listeners Learn New Acoustic Representations

Huan Luo, Xing Tian, Kun Song, Ke Zhou, and David Poeppel

Supplemental Data

Supplemental Figures

Figure S1, related to Figure 2

Another method to examine the cortical spatial distribution of 3-8 Hz phase reliability. For each subject, the mean ITC difference (RefRN-RN) in the ‘phase ROI’ (dotted in Figure 2C) was calculated in each of the 157 channels separately, and the 30 channels with the largest values were selected. Figure S1 shows the distribution map of the selected channels across subjects: the value in each channel indicates how many of the 13 subjects had that channel among their 30 selected channels. A spatial pattern over auditory cortex similar to that in Figure 2D was obtained.


Grand average ITC differences and time courses in additional time-frequency ROIs. (A) ITC time-frequency plots for the RefRN/RN pair (left) and the RefN/N pair (right). Three additional ROIs were selected. ROI 1 (3-8 Hz, 0-300 ms) represents the ‘onset’ range showing a large phase reset response (Figure 3B). ROI 2 (8-12 Hz, 500-1500 ms) covers the same time window as the ‘phase ROI’ but a different frequency range. (B, C) ITC courses across trials in ROI 1 (B) and ROI 2 (C), neither of which showed an ITC buildup. (D) ROI 3 (1.5-2.5 Hz, 500-1500 ms) covers the range around 2 Hz, the entrainment frequency, and revealed a sustained ITC rather than a gradual buildup, possibly indicating a dissociation between physical entrainment (2 Hz here) and a general memory formation process (3-8 Hz). (E) Time-frequency plots of RefRN-RN ITC differences in the control experiment that employed noise stimuli of 1.6 s duration. RN sounds were generated by concatenating two identical 0.8 s noise segments, and N sounds were 1.6 s (2 × 0.8 s) running noises. Left: grand average (N=4) time-frequency plot of the ITC differences between RefRN and RN. Right: the same results zoomed in to 1-8 Hz for better visualization. Note that the RefRN-RN ITC differences still occurred at 3-8 Hz.



Response functions for each of the four stimuli (RN, RefRN, N and RefN) in the early-trial group (trials 1-12) and the late-trial group (trials 13-24). As indicated by the red box, only the RefRN-RN comparison in the late-trial group showed a significant difference in the response function.

Supplemental Experimental Procedures

Subjects and MEG Recording

Thirteen subjects with normal hearing provided informed consent and were paid for their participation. After a 10 min training session (one reason that we obtained high d-prime scores even for unlearned noises), neuromagnetic signals were recorded continuously with a 157-channel whole-head MEG system (5 cm baseline axial gradiometer SQUID-based sensors; KIT, Kanazawa, Japan) in a magnetically shielded room, using a sampling rate of 1000 Hz and an online 1-200 Hz analog band-pass filter, with a notch filter centered at 60 Hz. In a pretest before the noise experiment, participants were presented with 1 kHz tone pips of 50 ms duration to determine their M100 evoked responses. The twenty channels with the largest M100 responses (absolute values) were selected as auditory channels for each subject separately.
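The channel selection described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function name `select_auditory_channels` and the toy amplitudes are hypothetical, and the sketch assumes that one M100 peak amplitude per channel has already been extracted.

```python
def select_auditory_channels(m100_amplitudes, n_select=20):
    """Return the indices of the n_select channels with the largest |M100|.

    m100_amplitudes: one signed M100 peak amplitude per channel
    (hypothetical input; the extraction of the peak is not shown here).
    """
    ranked = sorted(range(len(m100_amplitudes)),
                    key=lambda ch: abs(m100_amplitudes[ch]),
                    reverse=True)
    # Return the winners in ascending channel order for readability.
    return sorted(ranked[:n_select])


# Toy example: 6 hypothetical channels, keeping the 3 strongest.
toy_m100 = [12.0, -80.5, 3.2, 95.1, -60.0, 7.7]
selected = select_auditory_channels(toy_m100, n_select=3)
```

Ranking on the absolute value keeps channels on both sides of the dipolar field pattern, which show M100 peaks of opposite sign.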

Stimuli and Experimental Design

The white noise stimuli were generated as sequences of normally distributed random numbers at a sampling rate of 44.1 kHz. There were 4 stimulus conditions (RefRN, RN, RefN, N), each presented for 25 trials per block at a comfortable loudness level (~70 dB SPL). The conditions were randomly intermixed, with the restriction that neither RefRN nor RefN occurred on consecutive trials. The RN sounds were generated by seamlessly concatenating three identical 0.5 s noise segments (another reason that we obtained high d-prime scores) and thus contained repetitions within a sound, whereas the N sounds were 1.5 s of running noise and did not contain repetitions. Furthermore, one particular RN sound (RefRN) and one particular N sound (RefN) were randomly chosen and presented repeatedly throughout a block. Subjects were not made aware of the repeated exposure to RefRN and RefN. In each trial, listeners were asked to judge whether the noise stimulus contained repetitions by pressing one of two buttons (Yes or No). Among the 13 subjects, 7 completed one experimental block and the other 6 each completed four experimental blocks, with newly generated noise stimuli in each block.
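As an illustration of the stimulus construction and trial ordering described above, the following sketch generates one RN and one N sound and builds a trial order in which neither RefRN nor RefN repeats on consecutive trials. The randomization algorithm is not specified in the text, so the gap-insertion scheme used here is an assumption; it always yields a valid order but does not sample all valid orders uniformly. All function names are hypothetical.

```python
import random

FS = 44100  # sampling rate stated in the text (Hz)


def make_noise(n_samples, rng):
    """White noise: a sequence of normally distributed random numbers."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def make_rn(rng, seg_dur=0.5, n_repeats=3):
    """Repeated noise: one noise segment concatenated n_repeats times."""
    segment = make_noise(int(round(seg_dur * FS)), rng)
    return segment * n_repeats


def make_n(rng, dur=1.5):
    """Running noise of the same total duration, without repetition."""
    return make_noise(int(round(dur * FS)), rng)


def insert_non_adjacent(base, label, count, rng):
    """Insert `count` copies of `label` into distinct gaps of `base`.

    Each gap receives at most one copy, so at least one element of `base`
    separates any two copies.
    """
    gaps = set(rng.sample(range(len(base) + 1), count))
    out = []
    for pos in range(len(base) + 1):
        if pos in gaps:
            out.append(label)
        if pos < len(base):
            out.append(base[pos])
    return out


def make_trial_order(rng, n_per_condition=25):
    """Assumed scheme: shuffle RN and N trials, then insert RefN and RefRN."""
    base = ["RN"] * n_per_condition + ["N"] * n_per_condition
    rng.shuffle(base)
    order = insert_non_adjacent(base, "RefN", n_per_condition, rng)
    return insert_non_adjacent(order, "RefRN", n_per_condition, rng)


rng = random.Random(0)
rn_sound = make_rn(rng)   # 3 x 0.5 s identical segments
n_sound = make_n(rng)     # 1.5 s running noise
order = make_trial_order(rng)
```

In an actual block, the RefRN and RefN waveforms would be generated once and reused on every RefRN and RefN trial, whereas fresh RN and N waveforms would be drawn for the other trials.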

Four subjects participated in the attention control experiment. Critically, in addition to the original four stimulus types (RefRN, RN, N, RefN), there were two new stimulus conditions (RN-AM, N-AM), which were RN or N noises with a 30 Hz amplitude modulation. Each of the 6 stimulus conditions was presented for 20 trials per block, and the conditions were pseudo-randomly intermixed with the restriction that neither RefRN nor RefN occurred on consecutive trials. Subjects were asked to judge whether the noise stimulus contained amplitude modulation by pressing one of two buttons (Yes or No). Note that RefRN, RN, N and RefN were all non-targets in this experiment (they did not contain the 30 Hz amplitude modulation), so the observed phase effects would indicate an implicit learning process. Each subject completed three experimental blocks. The modulation index was set to 0.1 to make the task relatively hard and to force subjects to listen to the entire trial.
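The amplitude-modulated conditions can be sketched as below. The text gives only the modulation frequency (30 Hz) and the modulation index (0.1); the sinusoidal form y(t) = [1 + m sin(2πft)] x(t) and the function name `apply_am` are assumptions made for illustration.

```python
import math


def apply_am(signal, fs=44100, mod_freq=30.0, mod_index=0.1):
    """Multiply `signal` by an assumed sinusoidal envelope 1 + m*sin(2*pi*f*t)."""
    return [x * (1.0 + mod_index * math.sin(2.0 * math.pi * mod_freq * n / fs))
            for n, x in enumerate(signal)]


# A constant carrier makes the envelope itself visible in the output.
carrier = [1.0] * 44100
modulated = apply_am(carrier)
```

With an index of 0.1 the envelope varies by only about ±10% around the unmodulated level, which is consistent with the stated aim of keeping the detection task relatively hard.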

Four subjects participated in the experiment controlling for task difficulty and segment duration, in which RN sounds were generated by concatenating two identical 0.8 s noise segments, and N sounds were 1.6 s (2 × 0.8 s) running noises. This experiment increased task difficulty (only one repetition within a trial instead of two in the other experiments) and examined noise stimuli with a different segment duration (0.8 s instead of 0.5 s in the other experiments). On each trial, listeners were asked to judge whether the noise stimulus contained a repetition by pressing one of two buttons (Yes or No), and each subject completed four experimental blocks.

Data Analyses

MEG data were analyzed in MATLAB, partly using functions from the EEGLAB toolbox (Delorme and Makeig, 2004) and the Wavelet Toolbox. For each of the 6 subjects who finished 4 experimental blocks, all data analyses were performed separately for each block and then averaged across blocks.

Behavioral results analysis

The detection sensitivity (d’) values for RefRN, RN, RefN and N sounds were

calculated separately in each block as follows:

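$$
\begin{aligned}
d'_{\mathrm{RefRN}} &= z(H_{\mathrm{RefRN}}) - z\!\left(\frac{F_{\mathrm{N}} + F_{\mathrm{RefN}}}{2}\right) \\
d'_{\mathrm{RN}} &= z(H_{\mathrm{RN}}) - z\!\left(\frac{F_{\mathrm{N}} + F_{\mathrm{RefN}}}{2}\right) \\
d'_{\mathrm{RefN}} &= z(H_{\mathrm{RefN}}) - z\!\left(\frac{F_{\mathrm{RN}} + F_{\mathrm{RefRN}}}{2}\right) \\
d'_{\mathrm{N}} &= z(H_{\mathrm{N}}) - z\!\left(\frac{F_{\mathrm{RN}} + F_{\mathrm{RefRN}}}{2}\right)
\end{aligned}
$$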

where H and F are the hit rate and false alarm rate for each of the four stimulus types (hit response for RN/RefRN: Yes; hit response for N/RefN: No), and z denotes the z-transform (the inverse of the standard normal cumulative distribution function). Given that the main question of the study is how reoccurrence influences noise memory, we examined the effects within the same noise type (RefRN versus RN, RefN versus N). Specifically, for the RN sound type, we compared d’ between RefRN and RN, using the N and RefN trials together to compute the corresponding false alarm rate. For the N sound type, the RefRN and RN trials were combined to compute the false alarm rate for the d’ of RefN and N. Reaction time (RT) was measured relative to sound onset. To examine how behavioral performance evolved throughout an experimental block, the d’ and RT of two consecutive trials (e.g., trials 1-2, 2-3, 3-4, etc.) were averaged in a moving window throughout the block.
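A minimal sketch of the sensitivity calculation and the two-trial moving window follows. It assumes that z is the inverse standard normal cumulative distribution function and that the two false alarm rates are averaged before the z-transform. How rates of exactly 0 or 1 were handled is not stated in the text, so this sketch simply requires rates strictly between 0 and 1. The function names and the toy values are hypothetical.

```python
from statistics import NormalDist


def z(p):
    """z-transform: inverse standard normal CDF (requires 0 < p < 1)."""
    return NormalDist().inv_cdf(p)


def dprime(hit_rate, fa_rate_a, fa_rate_b):
    """d' = z(H) - z((F_a + F_b) / 2), using two pooled false alarm rates."""
    return z(hit_rate) - z((fa_rate_a + fa_rate_b) / 2.0)


def moving_average(values, window=2):
    """Average over `window` consecutive entries (1-2, 2-3, 3-4, ...)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Toy values: hit rate for RefRN, false alarm rates for N and RefN.
d_refrn = dprime(hit_rate=0.9, fa_rate_a=0.2, fa_rate_b=0.3)

# Toy per-trial reaction times (s) smoothed with the two-trial window.
rt_smoothed = moving_average([1.0, 2.0, 4.0])
```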

Time-frequency analysis

To assess time-frequency responses, single-trial data for each condition in each MEG channel were transformed using continuous complex Gaussian wavelet transforms (Wavelet Toolbox, MATLAB), with frequencies ranging from 1 to 30 Hz in steps of 1 Hz. The phase and power (squared absolute value) were extracted from the wavelet transform output at each time-frequency point for further analysis. The “inter-trial coherence” (ITC), which measures the phase consistency across trials under each stimulus condition, was calculated for each time-frequency point as follows:

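$$
ITC_{ij} = \left(\frac{\sum_{n=1}^{N} \cos\theta_{nij}}{N}\right)^{2} + \left(\frac{\sum_{n=1}^{N} \sin\theta_{nij}}{N}\right)^{2}
$$

where $\theta_{nij}$ is the phase at frequency bin i and temporal bin j in trial n, and N is the number of trials.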
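The ITC at a single time-frequency point can be sketched as below, assuming the form in which the mean cosine and the mean sine across trials are each squared and then summed, without a square root. The function name `itc` is hypothetical, and the input is assumed to be a list of single-trial phases in radians.

```python
import math


def itc(phases):
    """Inter-trial coherence at one time-frequency point.

    phases: one phase value (radians) per trial.
    Returns a value between 0 (uniformly spread phases) and 1 (identical phases).
    """
    n = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n
    mean_sin = sum(math.sin(p) for p in phases) / n
    return mean_cos ** 2 + mean_sin ** 2


aligned = itc([0.3] * 10)           # every trial has the same phase
opposed = itc([0.0, math.pi])       # two trials in anti-phase
quarter = itc([0.0, math.pi / 2])   # two trials a quarter cycle apart
```

Because the measure depends only on phase, a condition can show high ITC without any accompanying change in power.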

The power response, measuring the stimulus-induced mean power across trials, was normalized by subtracting the mean power in the baseline range (-0.5 to 0 s). Note that both ITC and power response were calculated in each MEG channel separately and then averaged across the 20 auditory channels to assess phase reliability and power response in auditory cortex. Differences in power and ITC between RefRN and RN, as well as those between RefN and N, were tested for statistical significance using two-tailed paired t tests.

To obtain the root-mean-square (RMS) waveform responses from auditory cortex, the 20 Hz low-pass filtered signals were baseline-corrected by subtracting the mean baseline value (-0.5 to 0 s) in each channel separately, and the root mean square was then taken across the 20 auditory channels for each stimulus condition.
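The baseline correction and the RMS across channels can be sketched as follows. The 20 Hz low-pass filtering step is omitted, and the function names, the toy time axis and the toy channel data are hypothetical.

```python
import math


def baseline_correct(signal, times, t0=-0.5, t1=0.0):
    """Subtract the mean of the samples with t0 <= t < t1 from the whole signal."""
    baseline = [x for x, t in zip(signal, times) if t0 <= t < t1]
    mean = sum(baseline) / len(baseline)
    return [x - mean for x in signal]


def rms_across_channels(channels):
    """Root mean square across channels at every time sample."""
    n_ch = len(channels)
    n_samples = len(channels[0])
    return [math.sqrt(sum(ch[k] ** 2 for ch in channels) / n_ch)
            for k in range(n_samples)]


# Toy data: 2 channels, 4 samples, the first 2 samples lie in the baseline.
times = [-0.5, -0.25, 0.0, 0.25]
raw = [[1.0, 1.0, 4.0, 5.0], [2.0, 2.0, -2.0, -1.0]]
corrected = [baseline_correct(ch, times) for ch in raw]
rms = rms_across_channels(corrected)
```

Squaring before averaging prevents channels with opposite polarity from cancelling each other.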

To assess the distribution map of the stronger phase reliability for RefRN over RN, the ITC differences between RefRN and RN within the ‘phase ROI’ (3-8 Hz, 0.5-1.5 s) were averaged for each of the 157 MEG channels separately, resulting in a ‘phase ROI’ distribution map for each subject.

3-8 Hz filtered response analysis

The 3-8 Hz phase and power time courses were calculated using the Hilbert transform of the 3-8 Hz band-pass filtered signal (two-way least-squares FIR filtering, EEGLAB toolbox) in each trial and each channel separately. The power time course was normalized by subtracting the mean baseline value and then averaged across the 20 auditory channels. Similarly, the ITC time course in each subject was calculated across the phase responses of the 25 trials in each channel and then averaged across the 20 auditory channels. To assess the buildup of phase reliability within an experimental block, the ITC across 6 consecutive trials was calculated in a window moving through the block (e.g., Trial 1-6, Trial 2-7, Trial 3-8, etc.). The mean ITC values within the ‘phase ROI’ (0.5-1.5 s) for each of the 20 trial groups (Trial 1-6, Trial 2-7, ..., Trial 20-25) were averaged across the 20 auditory channels to assess how the ITC evolved throughout experimental blocks.
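The six-trial moving window can be sketched as below for a single channel and a single time point. Averaging over the ‘phase ROI’ samples and over the 20 auditory channels is left out, the ITC is assumed to be the sum of the squared mean cosine and squared mean sine, and the toy phases are constructed purely for illustration.

```python
import math


def itc(phases):
    """Squared mean cosine plus squared mean sine across trials."""
    n = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n
    mean_sin = sum(math.sin(p) for p in phases) / n
    return mean_cos ** 2 + mean_sin ** 2


def sliding_itc(trial_phases, window=6):
    """ITC over consecutive trial groups: trials 1-6, 2-7, 3-8, ..."""
    return [itc(trial_phases[start:start + window])
            for start in range(len(trial_phases) - window + 1)]


# Toy block of 25 trials: the first 12 phases are spread evenly around the
# circle, the last 13 are identical, mimicking a phase pattern that stabilizes.
toy_phases = [k * math.pi / 3 for k in range(12)] + [1.0] * 13
buildup = sliding_itc(toy_phases)
```

With 25 trials and a window of 6, the function returns 20 values, matching the 20 trial groups mentioned in the text.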

Phase pattern discrimination analysis

To assess whether different RefRN sounds (in different blocks for the same subject) elicited different 3-8 Hz response patterns, we performed a phase pattern discrimination analysis based on our previous methods [18-19]. In each auditory channel separately, the between-block phase similarity (across trials for different RefRN sounds in different blocks) was subtracted from the within-block phase similarity (across trials for the same RefRN in a block) in the 3-8 Hz frequency range, as follows:

$$
Discrim_{i} = \frac{\sum_{j=1}^{J} ITC_{ij,\mathrm{within}}}{J} - \frac{\sum_{j=1}^{J} ITC_{ij,\mathrm{across}}}{J}
$$

where $ITC_{ij,\mathrm{within}}$ is the inter-trial phase coherence of the 3-8 Hz filtered response at temporal bin i in run j (trials from the same RefRN), and $ITC_{ij,\mathrm{across}}$ is the inter-trial phase coherence of the 3-8 Hz filtered response at temporal bin i in the across-group signal j (mixed trials from different RefRNs).

Note that the between-block trial sets were constructed to contain the same number of trials as the within-block trial sets, so that their ITC values are comparable. The phase pattern discrimination time course was then averaged across the 20 auditory channels for each subject. The same analysis was performed on phase responses to RefN sounds. The phase discrimination values within the ‘phase ROI’ (0.5-1.5 s) were then averaged to examine the overall phase discrimination performance. Moreover, we performed the same phase discrimination analysis on trials 1-12 (the early-trial group) and trials 13-24 (the late-trial group) to assess whether the discrimination ability differed before and after memory formation.
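The discrimination measure can be sketched as below for one channel and one temporal bin. How the mixed across-block trial groups were assembled is not detailed in the text, so this sketch assumes that all trials are pooled, shuffled and split into J groups of the same size as a block. The function names and the toy phases are hypothetical.

```python
import math
import random


def itc(phases):
    """Squared mean cosine plus squared mean sine across trials."""
    n = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n
    mean_sin = sum(math.sin(p) for p in phases) / n
    return mean_cos ** 2 + mean_sin ** 2


def discrimination(block_phases, rng):
    """Mean within-block ITC minus mean across-block ITC at one temporal bin.

    block_phases[j] holds the single-trial phases for the RefRN of block j;
    all blocks are assumed to contain the same number of trials.
    """
    n_blocks = len(block_phases)
    n_trials = len(block_phases[0])
    within = [itc(block) for block in block_phases]
    # Assumed mixing scheme: pool, shuffle, split into equally sized groups.
    pooled = [p for block in block_phases for p in block]
    rng.shuffle(pooled)
    across = [itc(pooled[j * n_trials:(j + 1) * n_trials])
              for j in range(n_blocks)]
    return sum(within) / n_blocks - sum(across) / n_blocks


rng = random.Random(0)
# Toy case 1: four blocks, each with its own perfectly reliable phase.
distinct = [[k * math.pi / 2] * 24 for k in range(4)]
discrim_distinct = discrimination(distinct, rng)
# Toy case 2: four blocks that all share the same phase.
identical = [[0.5] * 24 for _ in range(4)]
discrim_identical = discrimination(identical, rng)
```

A positive value means that trials of the same RefRN share a phase pattern that trials of different RefRNs do not, while a value near zero means the phase pattern is not specific to a particular noise.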

Response function (RF)

Based on methods from visual studies [1, 2], we calculated the cross-correlation between the 3-8 Hz band-pass filtered MEG response waveform within the ‘phase ROI’ range (0.5-1.5 s) in each trial and the corresponding stimulus envelope (0.5-1.5 s), which was first downsampled to 1 kHz. We did this for each stimulus condition separately, as follows:

$$
RF_{i}(t) = \sum_{T} soundenv_{i}(T) \cdot response_{i}(T + t)
$$

where $soundenv_{i}$ and $response_{i}$ denote the envelope of the noise stimulus and the corresponding 3-8 Hz band-pass filtered MEG response, respectively, in trial i, for each of the stimulus conditions.
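The cross-correlation for a single trial can be sketched as below, assuming the form RF(t) = Σ_T soundenv(T) · response(T + t) evaluated at non-negative lags t. The function name, the lag range and the toy signals are hypothetical; filtering and downsampling are not shown.

```python
def response_function(envelope, response, max_lag):
    """Cross-correlate a stimulus envelope with a response at lags 0..max_lag.

    Both inputs are equally long lists sampled at the same rate. At each lag,
    only the samples for which the shifted response exists are summed.
    """
    n = len(envelope)
    return [sum(envelope[T] * response[T + lag] for T in range(n - lag))
            for lag in range(max_lag + 1)]


# Toy example: the response is the envelope delayed by 2 samples.
toy_envelope = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
toy_response = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
rf = response_function(toy_envelope, toy_response, max_lag=3)
peak_lag = rf.index(max(rf))
```

In the toy case the function peaks at the lag that matches the imposed delay, which is the sense in which a response function describes how the neural signal follows the stimulus envelope.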

The cross-correlation course was averaged across early trials (trials 1-12) and late trials (trials 13-24) separately to produce response functions for the early-trial group and the late-trial group, respectively. Because MEG responses in auditory channels have opposite polarities (e.g., channels over the source and the sink show positive and negative peaks in the M100 onset response, respectively), we first normalized the sign of the 3-8 Hz filtered response of the 20 auditory channels by inverting the waveform of channels showing a negative onset response (averaged within 150-200 ms). The sign-normalized response functions were averaged across the 20 auditory channels per subject. Given that previous results showed strong phase reliability only for RefRN sounds, the response functions for RN, N and RefN sounds were grouped together (labeled as ‘Others’) for comparison.

Spatial similarity (SS)

Based on a previously published SS calculation method [3], we calculated the spatial similarity (SS) between the phase reliability map and the M100 map, as follows:

$$
SS = \frac{(A_{phase})^{T} (A_{M100})}{\lvert A_{phase} \rvert \cdot \lvert A_{M100} \rvert}
$$

where A, an n-dimensional column vector (n is the number of MEG channels here), refers to the spatial distribution map for phase reliability ($A_{phase}$) and for the tone localizer ($A_{M100}$). Statistical significance of the SS was then assessed using a randomization procedure. Specifically, we shuffled the channel values within the phase reliability map and calculated the SS between the shuffled phase map and the M100 map. Repeating this produced an SS randomization distribution, from which we determined the p<0.01 threshold used to assess the statistical significance of the original SS. Note that cross-subject variance was also taken into account by performing the permutation for each subject separately.
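The spatial similarity and its randomization test can be sketched as below, assuming that SS is the normalized inner product (cosine) of the two channel maps. The number of permutations is not given in the text, so `n_perm=1000` is an assumption, as are the function names and the toy maps.

```python
import math
import random


def spatial_similarity(a, b):
    """Normalized inner product of two channel maps (cosine of their angle)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def permutation_threshold(phase_map, m100_map, rng, n_perm=1000, alpha=0.01):
    """Upper (1 - alpha) quantile of SS after shuffling the phase map."""
    shuffled = list(phase_map)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys the channel correspondence
        null.append(spatial_similarity(shuffled, m100_map))
    null.sort()
    index = min(n_perm - 1, int(math.ceil((1.0 - alpha) * n_perm)) - 1)
    return null[index]


rng = random.Random(0)
# Toy maps over 157 channels that share exactly the same spatial pattern.
toy_m100 = [float(ch) for ch in range(157)]
toy_phase = [2.0 * value for value in toy_m100]
observed = spatial_similarity(toy_phase, toy_m100)
threshold = permutation_threshold(toy_phase, toy_m100, rng)
significant = observed > threshold
```

Because SS is normalized by the magnitude of both maps, it reflects the similarity of their spatial patterns and not the overall response strength.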

Supplemental References

1. Lalor, E.C., Pearlmutter, B.A., Reilly, R.B., McDarby, G., and Foxe, J.J. (2006). The VESPA: a method for the rapid estimation of a visual evoked potential. NeuroImage 32, 1549-1561.

2. VanRullen, R., and Macdonald, J.S. (2012). Perceptual echoes at 10 Hz in the human brain. Curr. Biol. 22, 995-999.

3. Tian, X., and Huber, D.E. (2008). Measures of spatial similarity and response magnitude in MEG and scalp EEG. Brain Topogr. 20, 131-141.
