Current Biology, Volume 23
Supplemental Information
Neural Response Phase
Tracks How Listeners Learn
New Acoustic Representations
Huan Luo, Xing Tian, Kun Song, Ke Zhou, and David Poeppel
Supplemental Data
Supplemental Figures
Figure S1, related to Figure 2
Another method to examine the cortical spatial distribution of 3-8 Hz phase reliability.
For each subject, the mean ITC difference (RefRN-RN) in the ‘phase ROI’ (dotted in
Figure 2C) was calculated in each of the 157 channels separately, and the 30 channels
having the largest values were selected. Figure S1 shows the distribution map for the
selected 30 channels across subjects, and the value in each channel indicates how
many subjects (out of 13 total subjects) had this channel within their respective 30
selected channels. The resulting spatial pattern over auditory cortex was similar to
that in Figure 2D.
Figure S2, related to Figure 3
Grand average ITC differences and time courses in additional time-frequency ROIs.
(A) ITC time-frequency plots for the RefRN/RN pair (left) and the RefN/N pair (right).
Three additional ROIs were selected. ROI 1 (3-8 Hz, 0-300 ms) represents the ‘onset’
range showing a large phase reset response (Figure 3B). ROI 2 (8-12 Hz, 500-1500
ms) covers the same time period as the ‘phase ROI’ but a different
frequency range. (B, C) ITC time courses across the block in ROI 1 (B) and ROI 2 (C);
neither showed a gradual ITC buildup. (D) ROI 3 (1.5-2.5 Hz, 500-1500 ms) covers the
range around 2 Hz, the entrainment frequency, and revealed a sustained ITC rather than
a gradual buildup, possibly indicating a dissociation between physical
entrainment (2 Hz here) and a general memory formation process (3-8 Hz). (E)
Time-frequency plots of RefRN-RN ITC differences in the control experiment that
employed noise stimuli 1.6 s in duration. RN sounds were generated by
concatenating two identical 0.8 s noise segments, and N sounds were 1.6 s (2 x 0.8 s)
running noises. Left: grand average (N=4) time-frequency plot of the ITC differences
between RefRN and RN. Right: the same results zoomed in to 1-8 Hz for better
visualization. Note that the RefRN-RN ITC differences still occurred at 3-8
Hz.
Figure S3, related to Figure 4
Response functions for each of the four stimuli (RN, RefRN, N and RefN) in the
early-trial group (trials 1-12) and the late-trial group (trials 13-24). As indicated by the red
box, only RefRN-RN in the late-trial group showed significant differences in the
response function.
Supplemental Experimental Procedures
Subjects and MEG Recording
Thirteen subjects with normal hearing provided informed consent and were paid
for their participation. After a 10 min training session (one reason that we
obtained high d-prime scores even for unlearned noises), neuromagnetic signals were
recorded continuously with a 157 channel whole-head MEG system (5 cm baseline
axial gradiometer SQUID-based sensors; KIT, Kanazawa, Japan) in a magnetically
shielded room, using a sampling rate of 1000 Hz and an online 1-200 Hz analog
band-pass filter, with a notch filter centered around 60 Hz. In an initial pretest before
the noise experiment, participants were presented with 1 kHz tone pips of 50 ms
duration to determine their M100 evoked responses. The twenty channels with the largest
M100 responses (absolute values) were selected as auditory channels for each subject
separately.
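The channel selection step can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code; the function name and the toy amplitudes are hypothetical, and the only assumption taken from the text is that channels are ranked by absolute M100 amplitude.

```python
def select_auditory_channels(m100_amplitudes, n_select=20):
    """Return the indices of the n_select channels with the largest |M100|.

    m100_amplitudes: one M100 peak amplitude per MEG channel (sign preserved).
    """
    ranked = sorted(range(len(m100_amplitudes)),
                    key=lambda ch: abs(m100_amplitudes[ch]),
                    reverse=True)
    # Return the selected channel indices in ascending order for readability.
    return sorted(ranked[:n_select])


# Toy example with 5 channels, keeping the 3 strongest responses.
toy_amplitudes = [0.1, -5.0, 3.0, -0.2, 4.0]
print(select_auditory_channels(toy_amplitudes, n_select=3))  # [1, 2, 4]
```

Ranking on the absolute value matters because source and sink channels show M100 peaks of opposite polarity.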
Stimuli and Experimental Design
The white noise sound stimuli were generated as sequences of normally distributed
random numbers at a sampling rate of 44.1 kHz. There were 4 stimulus conditions
(RefRN, RN, RefN, N), each of which was presented for 25 trials per block at a
comfortable loudness level (~70 dB SPL). They were randomly mixed together with
the restriction that RefRN, as well as RefN, did not occur on consecutive trials. The
RN sounds were generated by seamlessly concatenating three identical 0.5 s noise
segments (another reason that we obtained high d-prime scores), thus containing
repetitions within a sound, whereas N sounds were 1.5 s running noise and did not
contain repetitions. Furthermore, one particular RN sound (RefRN) and N sound
(RefN) were randomly chosen and presented throughout a block repeatedly. Subjects
were not made aware of the repeated exposure to RefRN and RefN. In each trial,
listeners were asked to judge whether noise stimuli contained repetitions by pressing
one of the two buttons (Yes or No). Among the 13 subjects, 7 completed one
experimental block, and each of the other 6 completed four experimental
blocks, each of which used newly generated noise stimuli.
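The construction of the RN and N stimuli can be illustrated with a short sketch. It is written in Python with the standard library only, not the authors' code; the function names and the random seed are hypothetical, while the sampling rate, segment duration and number of repetitions follow the description above.

```python
import random

FS = 44100  # stimulus sampling rate in Hz, as stated in the text


def running_noise(duration_s, rng):
    """White noise: a sequence of normally distributed random numbers."""
    n_samples = int(round(duration_s * FS))
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def repeated_noise(segment_s, n_repeats, rng):
    """Concatenate n_repeats identical copies of one fresh noise segment."""
    segment = running_noise(segment_s, rng)
    return segment * n_repeats


rng = random.Random(0)             # hypothetical seed, for reproducibility only
rn = repeated_noise(0.5, 3, rng)   # RN: three identical 0.5 s segments
n = running_noise(1.5, rng)        # N: 1.5 s running noise, no repetition
print(len(rn), len(n))             # both are 1.5 s long
```

Because the copies are sample-identical and directly adjacent, the only cue to repetition is the noise waveform itself.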
Four subjects participated in the attention control experiment. Critically, in
addition to the original four stimulus types (RefRN, RN, N, RefN), there were two
new stimulus conditions (RN-AM, N-AM), which were RN or N noises with a 30 Hz
amplitude modulation. Each of the 6 stimulus conditions was presented for 20 trials
per block and was pseudo-randomly intermixed, with a restriction that RefRN, as well
as RefN, did not occur on consecutive trials. Subjects were asked to judge whether the
noise stimuli contained amplitude modulation by pressing one of the two buttons (Yes
or No). It is noteworthy that the RefRN, RN, N and RefN were all non-targets in this
experiment (not containing 30 Hz amplitude modulation), and therefore the observed
phase effects would indicate an implicit learning process. Each subject completed
three experimental blocks. The modulation index was set to 0.1 to make the task
relatively hard and force subjects to listen to the entire trial.
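A sketch of how a 30 Hz amplitude modulation with a modulation index of 0.1 might be applied is given below. The text specifies only the modulation frequency and index; the sinusoidal form (1 + m sin(2πft)), the zero starting phase, and the function name are assumptions made for illustration.

```python
import math

FS = 44100  # stimulus sampling rate in Hz, as stated in the text


def amplitude_modulate(signal, mod_freq_hz=30.0, mod_index=0.1, fs=FS):
    """Multiply the signal by 1 + m*sin(2*pi*f*t) (assumed modulator shape)."""
    return [
        (1.0 + mod_index * math.sin(2.0 * math.pi * mod_freq_hz * k / fs)) * x
        for k, x in enumerate(signal)
    ]


# One modulation period (44100 / 30 = 1470 samples) applied to a constant
# signal, so that the output is the modulator itself.
modulator = amplitude_modulate([1.0] * 1470)
print(min(modulator), max(modulator))  # close to 0.9 and 1.1
```

With an index of 0.1 the envelope varies by only about 10% around its mean, which is consistent with the stated aim of making the task relatively hard.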
Four subjects participated in the experiment controlling task difficulty and
segment duration, in which RN sounds were generated by concatenating two identical
0.8 s noise segments, and N sounds were 1.6 s (2 X 0.8 s) running noises. This
experiment increased task difficulty (only one repetition within a trial instead of two
repetitions in other experiments) and examined noise stimuli with different segment
duration (0.8 s instead of 0.5 s in other experiments). On each trial, listeners were
asked to judge whether noise stimuli contained a repetition by pressing one of the two
buttons (Yes or No), and each subject completed four experimental blocks.
Data Analyses
MEG data were analyzed in MATLAB, partly using functions from the EEGLAB
toolbox (Delorme and Makeig, 2004) and the Wavelet toolbox. For each of the 6 subjects
who finished 4 experimental blocks, all data analyses were performed separately for
each block and then averaged across blocks.
Behavioral results analysis
The detection sensitivity (d’) values for RefRN, RN, RefN and N sounds were
calculated separately in each block as follows:
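$$
\begin{aligned}
d'_{RN} &= z(H_{RN}) - z\left(\frac{F_{N} + F_{RefN}}{2}\right) \\
d'_{RefRN} &= z(H_{RefRN}) - z\left(\frac{F_{N} + F_{RefN}}{2}\right) \\
d'_{N} &= z(H_{N}) - z\left(\frac{F_{RN} + F_{RefRN}}{2}\right) \\
d'_{RefN} &= z(H_{RefN}) - z\left(\frac{F_{RN} + F_{RefRN}}{2}\right)
\end{aligned}
$$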
where z denotes the inverse of the standard normal cumulative distribution function,
and H and F are the hit rate and false alarm rate for each of the four stimulus types
(Hit response for RN/RefRN: Yes; Hit response for N/RefN: No). Given that the main
question of the study is how reoccurrence influences noise memory, we
examined the effects within the same noise type (RefRN and RN, RefN and N).
Specifically, for the RN sound type, we compared d’ between RefRN and RN,
pooling N and RefN to obtain the corresponding false alarm rate. For the N sound type,
RefRN and RN were pooled to obtain the false alarm rate used to calculate d’ for RefN
and N. Reaction time (RT) was measured relative to sound onset. To examine how
behavioral performance evolved throughout an experimental block, the d’
and RT of two consecutive trials (e.g., trials 1-2, 2-3, 3-4, etc.) were averaged in a
moving window throughout the block.
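The d’ computation and the two-trial moving window can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB code; the function names and the example rates are hypothetical. Note that the inverse normal CDF is undefined for rates of exactly 0 or 1, and the text does not say how such cases were handled, so this sketch simply leaves them to raise an error.

```python
from statistics import NormalDist


def z(p):
    """Inverse of the standard normal CDF; p must lie strictly in (0, 1)."""
    return NormalDist().inv_cdf(p)


def d_prime(hit_rate, fa_rates):
    """d' = z(H) - z(mean F), with F pooled over the opposite noise type."""
    pooled_fa = sum(fa_rates) / len(fa_rates)
    return z(hit_rate) - z(pooled_fa)


def moving_average(values, width=2):
    """Average each run of `width` consecutive values (1-2, 2-3, 3-4, ...)."""
    return [sum(values[i:i + width]) / width
            for i in range(len(values) - width + 1)]


# Hypothetical rates for one block: d' for RefRN uses the N and RefN
# false alarm rates, as described in the text.
h_refrn, f_n, f_refn = 0.90, 0.20, 0.10
print(d_prime(h_refrn, [f_n, f_refn]))
print(moving_average([1.0, 2.0, 3.0, 4.0]))  # [1.5, 2.5, 3.5]
```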
Time-frequency analysis
To assess time-frequency responses, single-trial data for each condition in each
MEG channel were transformed using the continuous complex Gaussian wavelet
transform (Wavelet toolbox, MATLAB), with frequencies ranging from 1 to 30 Hz in
steps of 1 Hz. The phase and power (squared absolute value) were extracted from the
wavelet transform output at each time-frequency point for further analysis. The
“inter-trial coherence” (ITC), measuring the phase consistency across trials under
each stimulus condition, was calculated for each time-frequency point as follows:
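$$
ITC_{ij} = \left(\frac{\sum_{n=1}^{N} \cos\theta_{nij}}{N}\right)^{2} + \left(\frac{\sum_{n=1}^{N} \sin\theta_{nij}}{N}\right)^{2}
$$
where $\theta_{nij}$ is the phase at frequency bin i and temporal bin j in trial n,
and N is the number of trials.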
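For a single time-frequency point, the ITC definition above can be sketched as follows. The sketch is in Python, not the authors' MATLAB code, and the function name is hypothetical. It follows the formula as read here, that is, the squared mean cosine plus the squared mean sine with no square root; some other ITC definitions take the square root of this quantity.

```python
import math


def itc(phases):
    """Inter-trial coherence at one time-frequency point.

    phases: one phase value (in radians) per trial.
    Returns a value between 0 (no phase consistency) and 1 (identical phases).
    """
    n_trials = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n_trials
    mean_sin = sum(math.sin(p) for p in phases) / n_trials
    return mean_cos ** 2 + mean_sin ** 2


print(itc([0.3] * 10))              # identical phases: approximately 1
print(itc([0.0, math.pi]))          # opposite phases: approximately 0
print(itc([0.0, math.pi / 2.0]))    # quarter-cycle apart: approximately 0.5
```

In the full analysis this value would be computed at every frequency and time bin in each channel and then averaged across the 20 auditory channels.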
The power response, measuring the stimulus-induced mean power across trials,
was normalized by subtracting the mean power value in the baseline range (-0.5~0 s).
Note that both ITC and power response were calculated in each MEG channel
separately and then averaged across 20 auditory channels to assess the phase
reliability and power response in auditory cortex. Differences in power and ITC
between RefRN and RN, as well as those between RefN and N, were tested for
statistical significance using two-tailed paired t tests.
To obtain the Root Mean Square waveform responses from auditory cortex, the 20
Hz low-pass filtered signals were normalized by subtracting the mean baseline values
(-0.5 ~ 0 s) in each channel separately and then root-mean-squared across 20 auditory
channels for each stimulus condition.
To obtain the distribution map of the stronger phase reliability for RefRN over RN,
the ITC differences between RefRN and RN within the ‘phase ROI’ (3-8 Hz, 0.5-1.5 s)
were averaged for each of the 157 MEG channels separately, resulting in a ‘phase ROI’
distribution map for each subject.
3-8 Hz filtered response analysis
The 3-8 Hz phase and power time courses were calculated using the Hilbert
transform of the 3-8 Hz bandpass filtered signal (two-way least-squares FIR filtering,
EEGLAB toolbox) in each trial and each channel separately. The power time course
was normalized by subtracting the mean baseline values and then averaged across the 20
auditory channels. Similarly, the ITC time course in each subject was calculated
across the phase responses of the 25 trials in each channel and then averaged across
the 20 auditory channels. To assess the buildup of phase reliability within an
experimental block, the ITC was calculated across 6 consecutive trials in a window
sliding through the block (e.g., Trial 1-6, Trial 2-7, Trial 3-8, etc.). The mean ITC
values within the ‘phase ROI’ (0.5-1.5 s) for each of the 20 trial groups (Trial 1-6,
Trial 2-7, ..., Trial 20-25) were averaged across the 20 auditory channels to assess
how the ITC evolved throughout experimental blocks.
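The sliding six-trial ITC can be sketched as below for one channel and one time point. The sketch is in Python with hypothetical function names and toy phases; it reuses the ITC form read from the definition earlier in this section (squared mean cosine plus squared mean sine), which is itself an interpretation of the formula.

```python
import math


def itc(phases):
    """Squared mean cosine plus squared mean sine of the trial phases."""
    n_trials = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n_trials
    mean_sin = sum(math.sin(p) for p in phases) / n_trials
    return mean_cos ** 2 + mean_sin ** 2


def sliding_itc(trial_phases, window=6):
    """ITC over trials 1-6, 2-7, ... for one channel and one time point."""
    n_groups = len(trial_phases) - window + 1
    return [itc(trial_phases[start:start + window])
            for start in range(n_groups)]


# Toy block of 25 trials: phases alternate between 0 and pi for the first
# 12 trials (no consistency) and settle on a fixed phase afterwards.
toy_phases = [0.0, math.pi] * 6 + [1.0] * 13
course = sliding_itc(toy_phases)
print(len(course))            # 20 trial groups for 25 trials
print(course[0], course[-1])  # near 0 at the start, near 1 at the end
```

With 25 trials and a window of 6, the window can start at trials 1 through 20, which gives the 20 trial groups mentioned in the text.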
Phase pattern discrimination analysis
To assess whether different RefRN sounds (in different blocks for the same subject)
elicited different 3-8 Hz phase patterns, we performed a phase pattern discrimination
analysis based on our previous methods [18-19], by subtracting the between-block phase
similarity (across trials for different RefRN in different blocks) from the within-block
phase similarity (across trials for the same RefRN in a block) in the 3-8 Hz frequency
range, in each auditory channel separately, as follows:
$$
Discrim_{i} = \frac{\sum_{j=1}^{J} ITC_{ij,within}}{J} - \frac{\sum_{j=1}^{J} ITC_{ij,across}}{J}
$$
where $ITC_{ij,within}$ is the inter-trial phase coherence of the 3-8 Hz filtered
response at temporal bin i in run j (trials from the same RefRN), and
$ITC_{ij,across}$ is the inter-trial phase coherence of the 3-8 Hz filtered response
at temporal bin i in the across-group signal j (mixed trials from different RefRNs).
Note that the between-block trial data were created to have the same number of
trials as the within-block trial data to make their ITC values comparable. The phase
pattern discrimination time course was then averaged across the 20 auditory channels
for each subject. The same analysis was performed on phase responses to RefN sounds.
The phase discrimination values within the ‘phase ROI’ (0.5-1.5 s) were then
averaged to examine the overall phase discrimination performance. Moreover, we performed
the same phase discrimination analysis on trials 1-12 (the early trial group) and trials
13-24 (the late trial group) to assess whether the discrimination ability differed
before and after memory formation.
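The discrimination measure can be sketched for a single temporal bin as follows. The sketch is in Python with hypothetical function names. The text does not specify how trials were mixed to form the across-group signals, so the round-robin scheme in `mix_across` is purely an assumption that keeps the trial count equal to that of the within-block groups; the ITC form is the one read from the definition earlier in this section.

```python
import math


def itc(phases):
    """Squared mean cosine plus squared mean sine of the trial phases."""
    n_trials = len(phases)
    mean_cos = sum(math.cos(p) for p in phases) / n_trials
    mean_sin = sum(math.sin(p) for p in phases) / n_trials
    return mean_cos ** 2 + mean_sin ** 2


def mix_across(blocks):
    """Build one mixed group per block by drawing trial k from block (j+k) mod J.

    This round-robin rule is an assumed mixing scheme, not taken from the text.
    """
    n_blocks, n_trials = len(blocks), len(blocks[0])
    return [[blocks[(j + k) % n_blocks][k] for k in range(n_trials)]
            for j in range(n_blocks)]


def discrimination(blocks):
    """Mean within-block ITC minus mean across-block ITC at one temporal bin."""
    n_blocks = len(blocks)
    within = sum(itc(trials) for trials in blocks) / n_blocks
    across = sum(itc(trials) for trials in mix_across(blocks)) / n_blocks
    return within - across


# Two toy blocks whose RefRN sounds evoke opposite, perfectly reliable phases.
toy_blocks = [[0.0] * 6, [math.pi] * 6]
print(discrimination(toy_blocks))  # approximately 1: fully discriminable
```

A value near zero would indicate that trials from different RefRN sounds are as phase-consistent with each other as trials from the same sound.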
Response function (RF)
Based on methods in visual studies [1, 2], we calculated the cross-correlation between
the 3-8 Hz bandpass filtered MEG response waveform within the ‘phase ROI’ range
(0.5-1.5 s) in each trial, and the corresponding stimulus envelope (0.5-1.5 s), which
was first downsampled to 1 kHz. We did this for each stimulus condition separately,
as follows:
$$
RF_{i}(t) = \sum_{T} soundenv_{i}(T) \cdot response_{i}(T + t)
$$
where $soundenv_{i}$ and $response_{i}$ denote the
envelope of the noise stimulus and the corresponding 3-8 Hz bandpass filtered MEG
response respectively, in trial i, for each of the stimulus conditions.
The cross-correlation time course was averaged across early trials (trials 1~12) and late
trials (trials 13~24) separately to produce response functions for the early-trial group
and the late-trial group, respectively. Because MEG responses in auditory channels
have opposite polarity (e.g., channels in source and sink show positive and negative
peaks in the M100 onset response, respectively), we first normalized the sign of the 3~8
Hz filtered response of the 20 auditory channels by inverting the temporal waveform
of channels showing a negative onset response (averaged within 150~200 ms). The
sign-normalized response functions were averaged across the 20 auditory channels per
subject. Given that previous results showed strong phase reliability only for RefRN
sounds, the response functions for RN, N and RefN sounds were grouped together
(labeled as ‘Others’) for comparison.
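The cross-correlation underlying the response function can be sketched as follows. The sketch is in Python with hypothetical function names and toy data. The lag range, the absence of normalization, and the convention that positive lags mean the response follows the stimulus are assumptions; the text gives only the general cross-correlation form.

```python
def response_function(envelope, response, max_lag):
    """Cross-correlate a stimulus envelope with a response for lags 0..max_lag.

    RF(t) = sum over T of envelope[T] * response[T + t]; both inputs are
    assumed to have the same length and sampling rate.
    """
    n_samples = len(envelope)
    return [
        sum(envelope[T] * response[T + lag] for T in range(n_samples - lag))
        for lag in range(max_lag + 1)
    ]


def mean_over_trials(rfs):
    """Average single-trial response functions lag by lag."""
    return [sum(values) / len(rfs) for values in zip(*rfs)]


# Toy trial: the response is the envelope delayed by 3 samples, so the
# response function should peak at a lag of 3.
toy_envelope = [0, 1, 0, 0, 2, 0, 0, 0, -1, 0, 0, 0]
toy_response = [0, 0, 0] + toy_envelope[:-3]
rf = response_function(toy_envelope, toy_response, max_lag=8)
print(rf.index(max(rf)))  # 3
```

In the analysis described above, the single-trial functions would be averaged within the early and late trial groups, after sign normalization of the channels.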
Spatial similarity (SS)
Based on a previously published SS calculation method [3], we calculated the spatial
similarity (SS) between the phase reliability map and the M100 map, as follows:
$$
SS = \frac{(A_{phase})^{T} (A_{M100})}{|A_{phase}| \, |A_{M100}|}
$$
where A, an n-dimensional column vector (n is the number of
MEG channels here), refers to the spatial distribution map for phase reliability
($A_{phase}$) and for the tone localizer ($A_{M100}$). Statistical significance of the SS was then assessed
using a randomization procedure. Specifically, we shuffled the values within the phase
reliability map and calculated the SS between the shuffled phase map and the M100 map.
An SS randomization distribution was then constructed, from which we determined the
p<0.01 threshold to assess the statistical significance of the original SS. Note that
cross-subject variance was also taken into account by performing the permutation
for each subject separately.
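The spatial similarity measure and the randomization test can be sketched as follows. The sketch is in Python with hypothetical function names; the number of shuffles, the seed, and the way the p<0.01 threshold is read off the sorted null distribution are assumptions, since the text does not give these details.

```python
import math
import random


def spatial_similarity(map_a, map_b):
    """Normalized dot product (cosine of the angle) between two channel maps."""
    dot = sum(a * b for a, b in zip(map_a, map_b))
    norm_a = math.sqrt(sum(a * a for a in map_a))
    norm_b = math.sqrt(sum(b * b for b in map_b))
    return dot / (norm_a * norm_b)


def randomization_threshold(phase_map, m100_map, n_shuffles=1000,
                            alpha=0.01, seed=0):
    """SS value exceeded by fewer than alpha of the shuffled phase maps."""
    rng = random.Random(seed)          # hypothetical seed
    shuffled = list(phase_map)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # permute values across channels
        null.append(spatial_similarity(shuffled, m100_map))
    null.sort()
    n_extreme = max(1, int(alpha * n_shuffles))
    return null[-n_extreme]


# Toy maps over 157 channels: identical maps give SS = 1, well above the
# threshold obtained from shuffled maps.
toy_map = [float(ch) for ch in range(1, 158)]
observed = spatial_similarity(toy_map, toy_map)
threshold = randomization_threshold(toy_map, toy_map)
print(observed > threshold)  # True
```

Because SS is normalized by the vector magnitudes, it reflects the similarity of the spatial pattern rather than the overall response strength.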
Supplemental References
1. Lalor, E.C., Pearlmutter, B.A., Reilly, R.B., McDarby, G., and Foxe, J.J.
(2006). The VESPA: a method for the rapid estimation of a visual evoked
potential. NeuroImage 32, 1549-1561.
2. VanRullen, R., and Macdonald, J.S. (2012). Perceptual echoes at 10 Hz in the
human brain. Curr. Biol. 22, 995-999.
3. Tian, X., and Huber, D.E. (2008). Measures of spatial similarity and response
magnitude in MEG and scalp EEG. Brain Topography 20, 131-141.