Temporal Autocorrelation in Univariate Linear Modeling of FMRI Data


Page 1: Temporal Autocorrelation in Univariate Linear Modeling of FMRI …pages.stat.wisc.edu/~mchung/teaching/MIA/reading/fMRI... · 2007. 3. 27. · as the Cochrane–Orcutt transformation.


NeuroImage 14, 1370–1386 (2001), doi:10.1006/nimg.2001.0931, available online at http://www.idealibrary.com

Temporal Autocorrelation in Univariate Linear Modeling of FMRI Data

Mark W. Woolrich,*,† Brian D. Ripley,‡ Michael Brady,† and Stephen M. Smith*

*Oxford Centre for Functional Magnetic Resonance Imaging of the Brain, †Department of Engineering Science, and ‡Department of Statistics, University of Oxford, Oxford OX3 9DU, United Kingdom

Received November 16, 2000; published online October 24, 2001

In functional magnetic resonance imaging statistical analysis there are problems with accounting for temporal autocorrelations when assessing change within voxels. Techniques to date have utilized temporal filtering strategies to either shape these autocorrelations or remove them. Shaping, or "coloring," attempts to negate the effects of not accurately knowing the intrinsic autocorrelations by imposing known autocorrelation via temporal filtering. Removing the autocorrelation, or "prewhitening," gives the best linear unbiased estimator, assuming that the autocorrelation is accurately known. For single-event designs, the efficiency of the estimator is considerably higher for prewhitening compared with coloring. However, it has been suggested that sufficiently accurate estimates of the autocorrelation are currently not available to give prewhitening acceptable bias. To overcome this, we consider different ways to estimate the autocorrelation for use in prewhitening. After high-pass filtering is performed, a Tukey taper (set to smooth the spectral density more than would normally be used in spectral density estimation) performs best. Importantly, estimation is further improved by using nonlinear spatial filtering to smooth the estimated autocorrelation, but only within tissue type. Using this approach when prewhitening reduced bias to close to zero at probability levels as low as 1 × 10⁻⁵. © 2001 Academic Press

Key Words: FMRI analysis; GLM; temporal filtering; temporal autocorrelation; spatial filtering; single-event; autoregressive model; spectral density estimation; multitapering.

INTRODUCTION

In this paper we focus on the issues surrounding temporal autocorrelations in functional magnetic resonance imaging (FMRI) time series. These include understanding the nature of the autocorrelation, the effects of temporal filtering, the effects of different experimental designs, and ways of performing efficient and accurate statistical tests. In particular, the aim is

1053-8119/01 $35.00 Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.

to deduce an autocorrelation estimation technique which gives acceptably low bias when prewhitening.

We start with a brief overview of previous work in the area. We then set up a familiar GLM framework to define the different strategies for dealing with autocorrelations in FMRI. Four different approaches to temporal autocorrelation estimation are considered, and a qualitative data analysis is used to examine the way in which the estimated autocorrelation varies spatially, the effects of temporal filtering, and the effects of different design types. We then introduce nonlinear spatial smoothing of the autocorrelation as a means to improving the estimation further. Finally, quantitative assessment of the bias (calibration) for the different autocorrelation estimation techniques is performed, by computing null distributions from null/rest data and comparing them with the expected theoretical distributions.

OVERVIEW OF PREVIOUS WORK

Friston et al. (2000) suggested that current techniques for estimating the autocorrelation (autoregressive (AR) and 1/f models, where f is the frequency) are not accurate enough to give prewhitening acceptable bias. Therefore, estimation is made more robust to inaccurate autocorrelation estimates by swamping any intrinsic autocorrelation with the known autocorrelation introduced by band-pass filtering, an approach often referred to as "coloring" (Friston et al., 1995; Worsley and Friston, 1995). For this they use a Gaussian (or similar) low-pass filter matched to the hemodynamic response function (HRF) and a linear high-pass filter which aims to remove the majority of the autocorrelation due to low-frequency components. Having shaped the autocorrelation, prewhitening is then not applicable, and the autocorrelation estimate is instead used to correct the variance of univariate linear model parameter estimates and the degrees of freedom used in the GLM.

Although coloring is unbiased, given an accurate autocorrelation estimate, Bullmore et al. (1996) noted


the need for serially independent (whitened) residuals to obtain the best linear unbiased estimates (BLUE) of the GLM parameters. The parameter estimates are "best" in the sense that they are the unbiased estimates with the lowest variance. This was achieved using pseudo-generalized least squares, also known as the Cochrane–Orcutt transformation. The autocorrelation is estimated for the residuals from a first linear model and is then used to "prewhiten" the data and the design matrix, for use in a second linear model. The residuals of the second linear model should be close to white noise. Further iterations of this process are possible.

For inference, they ascertain the null distributions using randomization; that is, they randomly reorganize the order of signal intensity values in each observed time series and estimate the test statistic for each. It is important to note that such randomization of the time series is only valid in the absence of autocorrelations; hence an accurate prewhitening step is necessary. Any randomization approach on nonwhite data needs to randomize the data in such a way that the effective structure of the autocorrelation is maintained, for the null distribution to be valid.

To model the autocorrelation, Bullmore proposed an AR model of order 1 (AR(1)), which was shown to model the autocorrelations satisfactorily for the data used in their paper. Purdon and Weisskoff (1998) also suggested using an AR(1) model to do prewhitening, but they also included a white noise component. However, the main focus of their paper was to explore the effect, on the false positive rate, of not taking into account the temporal autocorrelation. For a desired false positive rate of α = 0.05 they find false positive rates as high as α = 0.16 in the uncorrected data. At α = 0.02 the situation worsens further, with uncorrected data giving α = 0.095. This is because any inaccuracies in the distribution compared with the assumed theoretical distribution are more prominent farther down the tail of the distribution.
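The kind of inflation Purdon and Weisskoff describe is easy to reproduce with a small Monte Carlo sketch. Everything below (the AR(1) coefficient, the boxcar design, the white-noise t statistic, the simulation size) is an illustrative assumption, not their actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_sim, phi = 200, 2000, 0.5  # assumed AR(1) coefficient and simulation size

def ar1_series(n, phi, rng):
    """Generate a zero-mean AR(1) series x(t) = phi*x(t-1) + e(t)."""
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

# A zero-mean boxcar regressor and a t statistic that (wrongly) assumes white noise.
box = np.tile(np.r_[np.ones(10), -np.ones(10)], N // 20)
fp = 0
for _ in range(n_sim):
    y = ar1_series(N, phi, rng)
    beta = (box @ y) / (box @ box)
    resid = y - beta * box
    se = np.sqrt((resid @ resid) / (N - 1) / (box @ box))  # white-noise formula
    if abs(beta / se) > 1.96:  # nominal two-sided alpha = 0.05
        fp += 1
false_positive_rate = fp / n_sim
print(false_positive_rate)
```

With positively autocorrelated noise and a slow design, the naive standard error is too small, so the empirical rate lands well above the nominal 0.05, in the spirit of the numbers quoted above.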

Locascio et al. (1997) used an autoregressive moving average (ARMA) model (see also Chatfield, 1996) and incorporated it into an overall contrast autoregressive and moving average (CARMA) model. As well as the ARMA and modeled experimental responses, the CARMA model contains baseline, linear, and quadratic terms for the removal of low-frequency drift. They fit separate MA and AR models of up to order 3.

Locascio et al. (1997) also suggest that the existence of positive autocorrelation is due to carryover from one time point to the next, stemming from time intervals that are smaller than the actual temporal changes. They describe autocorrelation as the persistence of neuronal activation, cyclical events (presumably they are referring to aliased cardiac and respiratory cycles), or possibly characteristics or artifacts of the measurement process.

Zarahn et al. (1997) and Aguirre et al. (1997) observed 1/f noise profiles in FMRI data and as a result attempted to use a 1/f noise model with three parameters to account for temporal autocorrelations. They also carried out a number of water phantom studies to establish how much, if any, of the 1/f noise is attributable to physiological processes. They concluded that the same 1/f noise was apparent in the phantoms and that therefore the noise was not of physiological origin.

They use their 1/f model of the intrinsic autocorrelation together with Worsley and Friston's (1995) approach of coloring the data. Zarahn proposed to refine this by incorporating the 1/f model's representation of the intrinsic autocorrelation into the autocorrelation due to temporal filtering, in order to give a better estimate of the autocorrelation after temporal filtering. Unfortunately, their attempts fail because they fit the 1/f model over the entire brain volume, therefore ignoring the potential for spatial nonstationarity of the noise profile.

Hu et al. (1995) concentrate on the components of the colored noise in FMRI data that are due to physiological fluctuations. By recording respiration and cardiac cycle data at the same time as the FMRI data are acquired, some of these effects can be removed. This could help to reduce the autocorrelation and potentially improve any autocorrelation estimation subsequently employed. However, these are not the only causes of colored noise in the data, and hence, whether or not respiration and cardiac cycle data are available, robust strategies for dealing with autocorrelation are still required.

METHODS

GLM Framework

In the basic GLM, Y = XB + e, Y is the observed data, X is the matrix of "regressors" (often referred to as the design matrix), and B is the parameter to be estimated. The error e is assumed to have a Normal distribution N(0, σ²V), where V is the autocorrelation matrix for the time series. There exists (Seber, 1977) a square, nonsingular matrix K such that V = KKᵀ and e = Kε, where ε is N(0, σ²I).
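The decomposition V = KKᵀ and the whitening role of K⁻¹ can be checked numerically. The AR(1) form of V below is an assumption made purely for the demonstration:

```python
import numpy as np

# Assumed AR(1) autocorrelation matrix: V[i, j] = phi^|i - j|.
N, phi = 50, 0.6
V = phi ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

# A Cholesky factor K satisfies V = K K^T (one valid choice of K).
K = np.linalg.cholesky(V)

# If eps ~ N(0, I), then e = K eps has covariance K K^T = V,
# and applying K^{-1} recovers the white series.
rng = np.random.default_rng(1)
eps = rng.standard_normal(N)
e = K @ eps
whitened = np.linalg.solve(K, e)
print(np.allclose(K @ K.T, V), np.allclose(whitened, eps))
```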

Now consider a GLM which incorporates temporal filtering of the data, where S is the square matrix that performs the temporal filtering via matrix multiplication. S is a Toeplitz matrix produced from the impulse response; this is directly equivalent to convolving with the impulse response using zero padding. The design



matrix is also temporally filtered using S to reflect the known change in the observed data. We now have

SY = SXB + η, (1)

where η is N(0, σ²SVSᵀ). We use an ordinary least-squares estimate of B, given by

B̂ = (SX)⁺SY, (2)

where (SX)⁺ is the pseudo-inverse of (SX), given by (SX)⁺ = ((SX)ᵀSX)⁻¹(SX)ᵀ. The variance of a contrast c of these parameter estimates B̂ is given by

Var{cᵀB̂} = k_eff σ², with
k_eff = cᵀ(SX)⁺SVSᵀ((SX)⁺)ᵀc. (3)

Note that k_eff is a scalar that scales σ² by an amount that depends upon the design matrix X, the temporal autocorrelation V, and the contrast c to give the variance of the contrast of parameter estimates. For an estimate of σ² we use (Worsley and Friston, 1995; Seber, 1977)

σ̂² = rᵀr / trace(RSVSᵀ), (4)

where R = I − SX(SX)⁺ is the residual-forming matrix, which can be used to obtain the residuals of the model fit:

r = RSY. (5)

Strategies for Dealing with Autocorrelation

For the moment we assume a known autocorrelation matrix V = KKᵀ. We investigate three approaches to dealing with the autocorrelation in FMRI; these are

● coloring (e.g., Friston et al., 1995), with S = A, where A is a low-pass filter, giving

k_eff = cᵀ(AX)⁺AVAᵀ((AX)⁺)ᵀc; (6)

● variance correction (by which one corrects the statistics for autocorrelation in the data, but neither colors the data nor goes as far as prewhitening), with S = I, giving

k_eff = cᵀX⁺V(X⁺)ᵀc; (7)

● prewhitening (e.g., Bullmore et al., 1996), which gives the optimal BLUE and is obtained by setting S = K⁻¹, giving

k_eff = cᵀ(XᵀV⁻¹X)⁻¹c. (8)

For all of these approaches we need an accurate estimate of the autocorrelation matrix SVSᵀ. Techniques for calculating such an estimate are discussed in the next section.
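As a sketch, the three values of k_eff in Eqs. (6)-(8) can be evaluated directly for a known V. The AR(1) autocorrelation, the boxcar-plus-constant design, and the Gaussian low-pass width below are all illustrative assumptions; by the Gauss-Markov argument the prewhitened value should be the smallest of the three.

```python
import numpy as np

def pinv_mat(A):
    """Pseudo-inverse A^+ = (A^T A)^{-1} A^T for a full-column-rank A."""
    return np.linalg.solve(A.T @ A, A.T)

def keff(S, X, V, c):
    """k_eff = c^T (SX)^+ S V S^T ((SX)^+)^T c, as in Eq. (3)."""
    P = pinv_mat(S @ X)
    return c @ P @ S @ V @ S.T @ P.T @ c

# Assumed setup: AR(1) autocorrelation, boxcar + constant design, contrast on the boxcar.
N, phi = 100, 0.4
V = phi ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
X = np.column_stack([np.tile(np.r_[np.ones(10), np.zeros(10)], N // 20), np.ones(N)])
c = np.array([1.0, 0.0])

k_var = keff(np.eye(N), X, V, c)  # variance correction: S = I, Eq. (7)

lag = np.subtract.outer(np.arange(N), np.arange(N))
A = np.exp(-0.5 * (lag / 2.0) ** 2)  # Gaussian low-pass filter (illustrative width)
A /= A.sum(axis=1, keepdims=True)
k_col = keff(A, X, V, c)          # coloring: S = A, Eq. (6)

# Prewhitening, S = K^{-1}, collapses to Eq. (8) and gives the BLUE.
k_white = c @ np.linalg.inv(X.T @ np.linalg.solve(V, X)) @ c

print(k_white, k_var, k_col)
```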

High-Pass Filtering

The data to be considered are the FMRI time series at each voxel following motion correction. The raw motion-corrected time series have a considerably colored noise structure, the majority of which occurs at low frequency. Therefore, in this paper our approach is to perform high-pass filtering to remove the worst of the low-frequency components. This is also beneficial since it is the low-frequency deterministic trends in the time series which contribute most to violating an assumption of second-order stationarity.

High-pass filtering can be performed by incorporating such things as a discrete cosine transform set into the design matrix X or into the matrix S (Friston et al., 2000). However, such techniques produce large end effects and so we prefer to use a nonlinear filter as proposed by Marchini and Ripley (2000). This approach fits and removes Gaussian-weighted running lines of fixed width using a least-squares fit and was found to be a reliable method of trend removal in Marchini and Ripley (2000). As in Marchini and Ripley (2000), the width of the Gaussian is chosen to be twice the cycle length when using boxcar or single-event with fixed interstimulus interval (ISI) (Bandettini and Cox, 2000) designs. However, for randomized ISI single-event designs (Burock et al., 1998; Dale and Buckner, 1997; Dale, 1999) the situation is not as clear. This is because the signal contains power at virtually all frequencies (see Fig. 10b). Hence, a compromise is used by setting the full-width half-maximum (FWHM) to 45 scans. This removes the worst of the low-frequency trends, allowing sensible autocorrelation modeling, while removing negligible power from the signal. Such nonlinear high-pass filtering is performed as a preprocessing step on all data sets subsequently used in this paper.
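A minimal sketch of a Gaussian-weighted running-line detrender in the spirit of the Marchini and Ripley filter (the exact weighting and end-point handling here are assumptions, not their implementation):

```python
import numpy as np

def gaussian_running_line(y, fwhm):
    """Remove a Gaussian-weighted running-line trend: at each time point, fit a
    weighted least-squares line and subtract its fitted value."""
    n = len(y)
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    t = np.arange(n, dtype=float)
    trend = np.empty(n)
    for i in range(n):
        w = np.exp(-0.5 * ((t - i) / sigma) ** 2)
        w /= w.sum()
        tbar, ybar = w @ t, w @ y
        slope = (w @ ((t - tbar) * (y - ybar))) / (w @ ((t - tbar) ** 2))
        trend[i] = ybar + slope * (i - tbar)
    return y - trend

# A slow sinusoidal drift plus white noise: the filter should remove most of the drift.
rng = np.random.default_rng(2)
n = 196
drift = 5.0 * np.sin(2 * np.pi * np.arange(n) / 150.0)
y = drift + rng.standard_normal(n)
hp = gaussian_running_line(y, fwhm=45)
print(np.std(y), np.std(hp))
```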

Autocorrelation Estimation

An estimate of the autocorrelation matrix SVSᵀ of the error η is required. We could estimate B using Eq. (2) to obtain the residuals r and then estimate the autocorrelation matrix of the residuals. However, it can be shown that

r = RSY = Rη, (9)

and then it follows that the autocovariance of the residuals is given by

Cov{r} = σ²RSVSᵀRᵀ, (10)


which is not SVSᵀ. The difference between the autocorrelation of the error and the autocorrelation of the residuals is clearly due to regression onto the design matrix, and it is impossible to unravel SVSᵀ from RSVSᵀRᵀ since R is noninvertible. This turns out to be not too much of a problem, as can be illustrated using an example time series generated artificially using the noise process N(0, σₜ²Vₜ), where Vₜ is the autocorrelation matrix for the typical gray matter autocorrelation in Fig. 5a. Figure 1a shows the spectral density of the time series, Fig. 1b shows the spectral density of the residuals r for the same time series after regression onto the HRF-convolved boxcar (shown in Fig. 7), and Fig. 1c shows the same after regression onto the randomized ISI design (shown in Fig. 10). Despite some subtle differences in the raw spectral density estimates, the Tukey estimates are remarkably similar. This is perhaps surprising, particularly for the randomized ISI design, which covers a large frequency range (see Fig. 10b). The primary reason for this is that the spectral density will be affected only at the frequency and phase of the regressors. Second, when the regressor has high power at a particular frequency but not at its neighboring frequencies (this is less true for the randomized ISI design but still has some effect), spectral density estimation techniques which heavily smooth the spectral density will help rectify this problem further. All techniques considered in this paper do effectively smooth the spectral density.

For variance correction or coloring, an estimate of SVSᵀ can be calculated from the residuals after Eq. (2) is used to obtain the parameter estimates. This estimate of SVSᵀ is used in Eq. (3) to give the variance of the parameter estimates.

However, prewhitening requires an estimate of SVSᵀ before the BLUE can be computed and Eq. (3) used. To get around this an iterative procedure is used (Bullmore et al., 1996). First, we obtain the residuals r using a GLM with S = I. The autocorrelation V is then estimated for these residuals. Given an estimate of V, V⁻¹ and hence K⁻¹ can be obtained by inverting in the spectral domain (some autocorrelation models, e.g., autoregressive, have simple parameterized forms for K⁻¹, and hence inversion in the spectral domain is not necessary). Next, we use a second linear model with S = K⁻¹, and the process can then be repeated to obtain new residuals from which V can be reestimated and so on. We use just one iteration and find that it performs sufficiently well in practice. Further iterations either give no further benefit or cause overfitting, depending upon the autocorrelation estimation technique used. Autoregressive model fitting procedures which determine the order would do the former, nonparametric approaches (Tukey, multitapering, etc.), the latter.

FIG. 1. (a) The spectral density of an artificially generated time series using the noise process N(0, σₜ²Vₜ). (b and c) The spectral densities of the residuals r for the same time series after regression onto the HRF-convolved boxcar shown in Fig. 7 and the randomized ISI shown in Fig. 10, respectively. Raw spectral density estimates are shown by the solid lines, and estimates using Tukey windowing (with M = 15) are shown by the broken lines. There is no visible difference between the Tukey estimated spectral densities.
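The one-iteration procedure can be sketched for the simplest autocorrelation model, AR(1), where K⁻¹ reduces to the familiar Cochrane–Orcutt differencing. The simulated design and noise coefficient below are assumptions made for the demo:

```python
import numpy as np

rng = np.random.default_rng(3)
N, phi = 200, 0.5  # assumed AR(1) noise coefficient

# Simulated data: boxcar signal plus AR(1) noise.
e = rng.standard_normal(N)
noise = np.empty(N)
noise[0] = e[0]
for t in range(1, N):
    noise[t] = phi * noise[t - 1] + e[t]
X = np.column_stack([np.tile(np.r_[np.ones(10), np.zeros(10)], N // 20), np.ones(N)])
Y = X @ np.array([2.0, 0.0]) + noise

# First linear model, S = I: obtain residuals.
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
r = Y - X @ beta_ols
rho0 = (r[:-1] @ r[1:]) / (r @ r)            # lag-1 autocorrelation of residuals

# Estimate V as AR(1) and prewhiten: K^{-1} subtracts phi_hat times the lagged value.
phi_hat = (r[:-1] @ r[1:]) / (r[:-1] @ r[:-1])
Yw = Y[1:] - phi_hat * Y[:-1]
Xw = X[1:] - phi_hat * X[:-1]

# Second linear model on the prewhitened data; its residuals should be near-white.
beta_pw, *_ = np.linalg.lstsq(Xw, Yw, rcond=None)
rw = Yw - Xw @ beta_pw
rho1 = (rw[:-1] @ rw[1:]) / (rw @ rw)
print(phi_hat, rho0, rho1)
```

The lag-1 autocorrelation of the prewhitened residuals drops to near zero, which is exactly the check on whiteness that motivates further iterations.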

Whether for use in prewhitening or for correcting the variance and degrees of freedom of the test statistic, an accurate, robust estimate of the autocorrelation is necessary. This estimation could be carried out in either the spectral or the temporal domain; they are interchangeable. Raw estimates (Eq. (11)) cannot be used since they are very noisy and introduce an unacceptably large bias. Hence some means of improving the estimate is required.

All approaches considered assume second-order stationarity, an assumption whose validity is helped by the use of the nonlinear high-pass filtering mentioned in the previous section. We consider standard windowing or tapering spectral analysis approaches, multitapering, parametric ARMA, and a nonparametric technique which uses some simple constraints. The results of these different techniques applied to a typical gray-matter voxel in a rest/null data set are shown for comparison in Fig. 2.

Single Tapers

A standard approach to taking raw estimates of theautocorrelation or equivalently of the spectral density



is to window the raw time series prior to taking a Fourier transform. This downweights points at either end of the time series, reducing leakage due to end effects (Bracewell, 1978). Equivalently, the raw autocorrelation estimate can be tapered such that it is downweighted at high lags. Intuitively, this seems reasonable, since the precision of the raw autocorrelation estimates clearly decreases with high lag.

FIG. 2. (a) Autocorrelation and (b) spectral density, for a typical gray-matter voxel in a rest/null data set. Raw estimates are shown by the solid lines, and (from top to bottom) estimates from Tukey windowing with M = 15, nonparametric PAVA, multitapering with NW = 13, and a general order AR model are shown by broken lines.

Whether windowing the time series or the raw autocorrelation estimate, the shape and size of the window need to be decided upon. Here, we prefer to use windowing of the raw autocorrelation estimate. This is


because of considerations of spatial regularization, which we will come to later. For a time series x(t), for t = 1, . . . , N, the raw autocorrelation estimate at lag τ is given by

r_xx(τ) = [1/((N − τ)σ̂²)] Σ_{t=1}^{N−τ} x(t)x(t + τ). (11)

The two favored windows in the time series literature are the Tukey and Parzen windows, which appear to perform equally well (Chatfield, 1996). Hence, we arbitrarily concentrate on the Tukey window, which is defined as

r̃_xx(τ) = (1/2)(1 + cos(πτ/M)) r_xx(τ) if τ < M,
r̃_xx(τ) = 0 if τ ≥ M, (12)

where M is the truncation point such that for τ ≥ M, r̃_xx(τ) = 0. This window smooths the spectral density by an amount determined by M.

The choice of the value for M is a balance between reducing the variance while minimizing the distortion of the autocorrelation/spectral density estimate. The variance in the estimation of the spectral density is given by (Chatfield, 1996)

Var[r̂_xx(τ)/r_xx(τ)] = 3M/(4N). (13)

Large M corresponds to less smoothing in the spectral domain. A rough guide in the literature is to set M to be about 2√N (Chatfield, 1996). For N = 200 this gives M ≈ 28.
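The raw estimate of Eq. (11) and the Tukey window of Eq. (12), with the M ≈ 2√N guide, can be sketched as follows (the white-noise test series is an arbitrary choice for the demo):

```python
import numpy as np

def raw_autocorr(x, max_lag):
    """Raw autocorrelation estimate r_xx(tau), following Eq. (11)."""
    x = x - x.mean()
    n = len(x)
    s2 = (x @ x) / n
    return np.array([(x[:n - k] @ x[k:]) / (n - k) / s2 for k in range(max_lag + 1)])

def tukey_taper(r, M):
    """Tukey-windowed autocorrelation, Eq. (12): halved raised cosine for
    tau < M, zero at and beyond the truncation point M."""
    tau = np.arange(len(r))
    w = np.where(tau < M, 0.5 * (1.0 + np.cos(np.pi * tau / M)), 0.0)
    return w * r

rng = np.random.default_rng(4)
N = 200
x = rng.standard_normal(N)
M = int(round(2 * np.sqrt(N)))   # the rough 2*sqrt(N) guide: M = 28 for N = 200
r = raw_autocorr(x, max_lag=N // 2)
r_t = tukey_taper(r, M)
print(M, r_t[0], r_t[M])
```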

Nonparametric Estimation

Instead of presetting M to what is considered a reasonable value, we instead apply some constraints. The first assumption is that r_xx(τ) > 0 for all τ. The second assumption is that the autocorrelation is monotonically decreasing. This means that low-frequency components, which are widely accepted in the literature as being the most important to account for, are favored.

The autocorrelation is estimated using a standard unbiased estimator (Eq. (11)) and then the best least-squares fit that satisfies the constraint of monotonicity can be obtained using techniques from the literature of isotonic regression. The particular algorithm that we use is the pool adjacent violators algorithm (PAVA) (T. Robertson and Dykstra, 1988), which provides a unique, least-squares fit under the constraint. Before using the algorithm, we set r̃_xx(τ) = r_xx(τ) for τ = 1, . . . , N/3 and r̃_xx(τ) = 0 for τ > N/3. This is done partly because it reduces the amount of data the algorithm iterates over and also because the raw autocorrelation estimate is very noisy for τ ≥ N/3 (there are fewer data available to compute autocorrelations at high lags), and we do not expect significant autocorrelation at such high lags. Furthermore, the value of zero will propagate, eventually stopping at the lag which gives M. For the purpose of the algorithm it is also necessary to define a weighting function w(τ) = 1 for τ = 1, . . . , N. The algorithm then proceeds as follows:

1. If r̃_xx(τ) is not isotonic, there must exist a violator at k such that r̃_xx(k − 1) < r̃_xx(k).

2. Pool these two values by replacing them both with their weighted average:

[r̃_xx(k − 1)w(k − 1) + r̃_xx(k)w(k)] / [w(k − 1) + w(k)]. (14)

3. Replace w(k − 1) and w(k) with w(k − 1) + w(k).

4. Repeat until no more violators remain.

The algorithm was tested on artificial data which consisted of white noise of length N = 200 that had been low-pass filtered with a Gaussian of varying standard deviation. This highlighted a slight bias for white noise data, which was easily remedied by setting r̃_xx(1) = 0 if r̃_xx(1) < 0.1.
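The pooling steps above can be sketched directly; this is a straightforward implementation under the monotonically decreasing constraint, with the toy input sequence an arbitrary choice:

```python
import numpy as np

def pava_decreasing(r):
    """Pool-adjacent-violators fit of a monotonically decreasing sequence,
    following steps 1-4 in the text."""
    vals = list(np.asarray(r, dtype=float))
    wts = [1.0] * len(vals)
    i = 1
    while i < len(vals):
        if vals[i - 1] < vals[i]:  # violator of the decreasing constraint
            # Pool the two blocks into their weighted average, Eq. (14).
            pooled = (vals[i - 1] * wts[i - 1] + vals[i] * wts[i]) / (wts[i - 1] + wts[i])
            vals[i - 1] = pooled
            wts[i - 1] += wts[i]
            del vals[i], wts[i]
            i = max(i - 1, 1)      # re-check against the previous block
        else:
            i += 1
    # Expand the pooled blocks back to the original length.
    return np.concatenate([np.full(int(w), v) for v, w in zip(vals, wts)])

r = np.array([1.0, 0.9, 0.95, 0.5, 0.6, 0.2, 0.1])
fit = pava_decreasing(r)
print(fit)
```

Pooling preserves the mean of each merged block, so the fit is the least-squares monotone approximation of the input.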

Multitapering

Multitapering is an extension of single-taper approaches and consists of dividing the data into overlapping subsets that are each individually tapered and then Fourier transformed. The individual spectral coefficients of each subset are averaged to reduce the variance. The way in which the data are to be subdivided is defined by a set of tapers indexed by l = 1, . . . , L; the estimated spectral density at frequency bin f is then given by

S(f) = Σ_{l=0}^{L−1} λ_l S_l(f) / Σ_{l=0}^{L−1} λ_l, (15)

where S_l(f) is the estimated spectral density using taper l and λ_l is the weight for each tapered spectral density estimate.

As with the single-window approaches the spectral density is effectively smoothed, but without losing information at the end of the time series. The windows are chosen so that they are orthogonal and reduce leakage as much as possible. Under these requirements the optimal choice is the discrete prolate spheroidal sequences, or Slepian sequences (Percival and Walden, 1993). The Slepian sequence used is determined by the length of the time series N and by a parameter W, which corresponds to the half-bandwidth



(i.e., ω = 2πW is the half-bandwidth in radians). When using the Slepian sequences to give S_l(f), the weight λ_l used in Eq. (15) could simply be unity or something more complex. In this paper we use the eigenvalues of the Toeplitz matrix associated with the Fourier transformation as the weights (Percival and Walden, 1993). However, even these weights aren't optimal; more elaborate weighting such as Thomson's nonlinear approach (Percival and Walden, 1993), which adapts to the local variations in the spectral density, could be used instead.

As with the single-taper approaches, the parameter W needs to be chosen to balance the desired reduction in variance with minimizing the distortion of the spectral estimate. For Slepian-sequence multitapering the variance in the estimate is given by (Percival and Walden, 1993)

Var[r̂_xx(τ)/r_xx(τ)] = 1/(2NW). (16)

Hence, by using Eq. (16) with Eq. (13), a Tukey tapering approach can be compared with multitapering by setting the variances the same. For example, for N = 200 the recommended Tukey parameter was M = 28. To give the same variance requires a multitaper parameter of NW ≈ 7.
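A sketch of Eq. (15) using SciPy's discrete prolate spheroidal sequences; the choice of 2NW − 1 tapers and the use of the concentration eigenvalues as weights follow common practice and are assumptions where the text is not specific:

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(x, NW):
    """Multitaper spectral estimate, Eq. (15): eigenvalue-weighted average of
    Slepian-tapered periodograms."""
    n = len(x)
    K = int(2 * NW) - 1  # a common choice for the number of usable tapers
    tapers, ratios = dpss(n, NW, Kmax=K, return_ratios=True)
    spectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    return (ratios[:, None] * spectra).sum(axis=0) / ratios.sum()

rng = np.random.default_rng(5)
N = 200
x = rng.standard_normal(N)
S = multitaper_psd(x, NW=7)
print(S.shape, float(S.mean()))
```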

Autoregressive Parametric Model Estimation

Stationary stochastic time series can be modeled using an autoregressive process of sufficiently high order (AR(p)):

x(t) = φ_1 x(t − 1) + φ_2 x(t − 2) + · · · + φ_p x(t − p) + e(t), (17)

where e(t) is a white noise process and φ_1, φ_2, . . . , φ_p are the autoregressive model parameters. There is the option to fit more complex ARMA models (an autoregressive process forced by a moving average process). However, AR models are often used on their own due to their relative simplicity in fitting. It may then require more parameters to fit the process with no significant loss of accuracy. This is the approach considered here.

The time series literature, including Chatfield (1996), describes various techniques for determining the order p and parameters of AR models. Here, we use the partial autocorrelation function (PACF) to find p and ordinary least squares to fit the parameters. When fitting an AR(p) model, the last partial coefficient α_p measures the excess correlation at lag p not accounted for by an AR(p − 1) model; α_p plotted for all p is the PACF. The lowest value of p for which α_p in the PACF is not significantly different from zero (using the 95% confidence limits of approximately ±2/√N (Chatfield, 1996)) is the order used.
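One reading of this procedure, sketched on simulated AR(2) data (the coefficients 0.5 and 0.3 are assumptions for the demo): fit AR(p) for increasing p, take the last coefficient as α_p, and stop once it falls inside ±2/√N, keeping the previous order.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of AR(p); returns (phi_1, ..., phi_p)."""
    n = len(x)
    X = np.column_stack([x[p - k - 1: n - k - 1] for k in range(p)])
    phi, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return phi

def pacf_order(x, max_p=10):
    """alpha_p is the last coefficient of an AR(p) fit; return the last order
    whose alpha_p lies outside the approximate 95% limits +/- 2/sqrt(N)."""
    limit = 2.0 / np.sqrt(len(x))
    for p in range(1, max_p + 1):
        if abs(fit_ar(x, p)[-1]) < limit:
            return p - 1
    return max_p

rng = np.random.default_rng(6)
n = 500
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + e[t]
order = pacf_order(x)
phi_hat = fit_ar(x, order)
print(order, phi_hat)
```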

QUALITATIVE DATA ANALYSIS

Methods

The intention here is to explore the effects of temporal filtering and the spatial variation of the autocorrelations in real FMRI data. We could attempt to examine the autocorrelation, or equivalently the power spectral density, itself. However, this would give N (number of scans or time points) data points for each voxel. Instead, we use

S_r = N / [1 + 2 Σ_{τ=1}^{N−1} r_xx(τ)]. (18)

This produces a single value whose variation can then be easily visualized. The value S_r corresponds approximately to 1/k_eff in Eq. (7) when performing a t test (c = [1] and X = [1, . . . , 1]ᵀ) with large N.
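Eq. (18) is cheap to compute from any estimated autocorrelation sequence; a sketch with two made-up autocorrelations (white noise, and an AR(1)-like decay):

```python
import numpy as np

def S_r(r):
    """S_r = N / (1 + 2 * sum_{tau=1}^{N-1} r_xx(tau)), Eq. (18); r is the
    autocorrelation sequence with r[0] = 1."""
    return len(r) / (1.0 + 2.0 * np.sum(r[1:]))

N = 196
r_white = np.zeros(N)
r_white[0] = 1.0                 # white noise: zero autocorrelation at all lags > 0
r_pos = 0.7 ** np.arange(N)      # an illustrative positively autocorrelated series
print(S_r(r_white), S_r(r_pos))
```

For white noise S_r equals N, while positive autocorrelation drives S_r below N, matching the interpretation given below.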

An S_r = N indicates white noise and 0 < S_r < N indicates a time series with positive autocorrelation. We examined one rest/null data set from a normal volunteer. Two hundred echo planar images (EPI) were acquired using a 3-T system with time to echo (TE) = 30 ms, TR = 3 s, in-plane resolution 4 mm, and slice thickness 7 mm. The first 4 scans were discarded to leave N = 196 scans and the data were motion corrected using AIR (Woods et al., 1993). To calculate S_r at each voxel we arbitrarily used the nonparametric PAVA autocorrelation approach to estimate the autocorrelation.

Results

Figure 3 shows a set of histograms of S_r for the entire brain volume with the skull and background removed. With no temporal filtering, the histogram has a peak at the low values of S_r ≈ 15, representing tissue with high autocorrelation, and a peak at S_r = 196 corresponding to white noise.

High-Pass Filtering

High-pass filtering is used to remove the worst of the low-frequency noise in the FMRI time series. Here we are using the nonlinear high-pass filtering discussed in the previous section. In Fig. 3b an FWHM of 40 scans was used and had the effect of removing the lower peak and pushing the whole histogram to higher values. This shift to higher S_r values represents a decrease in the positive autocorrelation in the data; this corresponds to a decrease in the parameter variances, as desired. The use of high-pass filtering reduces the low-frequency noise and nonstationarity of the time series, making the estimation of the autocorrelation more robust and valid. However, having used a nonlinear high-pass filter, the power spectral density of the time series has been shaped in such a way that it is not easily modeled by a low-order parametric AR model.

Low-Pass Filtering

FIG. 4. (a) Spatial maps of S_r in the brain volume after the nonlinear high-pass filtering has been performed. This corresponds to the histogram in Fig. 3b, with high S_r displayed as lighter gray and low S_r as darker gray. (b) EPI for the same slices. Exactly the same characteristic could be easily observed in five other null data sets.

FIG. 3. Histograms of S_r for a null FMRI data set with (a) no temporal filtering, (b) nonlinear high-pass filtering, and (c) low-pass filtering matched to a Gaussian HRF followed by nonlinear high-pass filtering. The only preprocessing is motion correction. Similar results were obtained for five other null data sets from three different subjects on the same scanner. The two apparent populations in (b) are due largely to white and gray matter; this can be seen in the spatial map of S_r in Fig. 4.

The low-pass filter used here is a Gaussian filter matched to a commonly assumed HRF with parameters σ = 2.98 s and m = 3 s (Friston et al., 1995). The histogram for low-pass filtering demonstrates the idea of coloring, in that the whole histogram is focused to a peak centered on an S_r close to that entirely due to the low-pass filtering. However, the standard deviation around this peak (standard deviation = 17.6) is greater than the standard deviation of the estimator of S_r (determined empirically on artificial data as standard deviation = 12.9). Since the data are showing a greater variability in S_r than there is in just estimating S_r, this



1378 WOOLRICH ET AL.

suggests the requirement for local estimation of autocorrelation even when low-pass filtering (coloring) is performed. Although the autocorrelation estimate is made more robust, the autocorrelation imposed by the coloring does not completely smother the intrinsic autocorrelation.

Spatial Variation

Figure 4 shows four slices of the Sr values in the brain volume after nonlinear high-pass filtering has been performed, along with the original functional image for comparison. The maps show considerable spatial variation and structure, with lower Sr corresponding to increased autocorrelation in the gray matter compared to the white matter and CSF. Exactly the same characteristic could easily be observed in five other null data sets.

These findings appear to contradict Zarahn's and Lund's conclusions that the autocorrelation is not at all physiological in origin. It may be that the greater smoothness in the gray matter is due to a larger number of edges in gray matter compared to white matter. Edges within voxels may be subject to motion of any type (inaccurate motion correction, physiological pulsations), and this motion, along with a partial volume effect, may produce increased low-frequency noise. Further analysis is required to understand such sources of the autocorrelations that could exhibit these characteristics. This will be a topic for future research.

FIG. 5. (a) Autocorrelation for a typical gray-matter voxel in a rest/null data set, and (b) the power spectral density for the same autocorrelation.

FIG. 6. (a) Gamma HRF, f_G(t; a, b), with parameters set according to mean a/b = 6 s and variance a/b² = 9 s², and (b) the spectral density for the HRF.

EFFECT OF DIFFERENT REGRESSORS

Methods

It turns out that the regressor used in the design matrix considerably affects the relative efficiency between coloring and prewhitening. To illustrate this we use a typical autocorrelation estimated for a gray-matter voxel (shown in Fig. 5) to give a typical V matrix, and use for X one of four different types of regressor of particular interest:

(a) a boxcar design with period 60 s;
(b) a single-event design with fixed ISI of 15 s and stimulus duration of 0.1 s (Bandettini and Cox, 2000);
(c) a single-event design with stimulus duration of 0.1 s with jittering such that the ISIs are drawn from a uniform distribution U(13.5 s, 16.5 s) (Josephs et al., 1997);





(d) a single-event design with randomized ISI taken from a normal distribution with mean 6 s and standard deviation 2 s, with no ISI less than 2 s (Burock et al., 1998; Dale and Buckner, 1997; Dale, 1999), and stimulus duration of 0.1 s.
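Such regressors are built by convolving a stimulus train with the gamma HRF of Eq. (19) and sampling at the TR. Solving mean a/b = 6 s and variance a/b² = 9 s² gives b = 2/3 and a = 4. A minimal sketch in NumPy (the event timing and grid spacing here are illustrative, not taken from the paper):

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_hrf(t, a=4.0, b=2.0/3.0):
    """Gamma HRF of Eq. (19); mean a/b = 6 s, variance a/b**2 = 9 s**2."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, (b**a / gamma_fn(a)) * t**(a - 1) * np.exp(-b * t), 0.0)

# Illustrative regressor: stimulus train convolved with the HRF on a fine
# grid, sampled down to TR = 3 s, then demeaned (the paper removes the
# mean from all regressors).
dt, TR = 0.1, 3.0
t_hi = np.arange(0.0, 300.0, dt)          # high-resolution time grid (s)
stim = np.zeros_like(t_hi)
stim[::int(round(60.0 / dt))] = 1.0       # one brief event every 60 s
reg_hi = np.convolve(stim, gamma_hrf(t_hi) * dt)[:len(t_hi)]
reg = reg_hi[::int(round(TR / dt))]       # sample at the TR
reg = reg - reg.mean()
```

The HRF integrates to one and peaks at (a − 1)/b = 4.5 s, consistent with a 6 s mean lag.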

All four designs are convolved with the same gamma HRF,

    f_G(t; a, b) = (b^a / Γ(a)) t^(a−1) e^(−bt),   (19)

where the gamma parameters a, b are set according to mean a/b = 6 s and variance a/b² = 9 s². The gamma HRF for these parameters is shown in Fig. 6. More complicated HRF models could be used. However, any reasonable HRF model would be expected to have a similar spectral density and therefore behave in a similar way in this context. For all regressors TR is taken as 3 s and all regressors have their means removed.

The variance of the parameter estimates, k_eff σ², is inversely proportional to the efficiency and can give us a measure of the relative efficiency of the different temporal filtering strategies. Although the estimation of σ does depend upon the temporal filtering strategy used (Eq. (4) depends on S), this effect is negligible. Subsequently, we define a measure of efficiency, E,

TABLE 1

Relative Efficiency E (Eq. (20)) Calculated Using the Three Different Strategies for Dealing with the Autocorrelation and for Four Different Types of Regressor

                             Boxcar   SE/fixed ISI   SE/jittered ISI   SE/random ISI
Coloring (Eq. (6))             0.96       0.70            0.68              0.21
Variance correction (Eq. (7))  0.99       0.98            0.98              0.80
Prewhitening (Eq. (8))         1.00       1.00            1.00              1.00

FIG. 7. Plots of (a) the regressor X, (b) the spectral density of X, (c) the spectral density of SX where S is the coloring low-pass filtering Toeplitz matrix, and (d) the spectral density of K⁻¹X where K⁻¹ is the prewhitening matrix from V = KKᵀ and X is a boxcar.

relative to the maximally efficient prewhitening estimator for the regressor as

    E = k_eff for prewhitening / k_eff.   (20)

This was computed for each regressor using each of the three different temporal filtering strategies, using Eqs. (6), (7), and (8) accordingly (there is only one regressor in each case, so we use c = [1]). The low-pass filter used for the coloring was matched to the HRF (Fig. 6). The values of E for the randomized ISI and jittered ISI designs were averaged over 100 randomly generated designs.
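The comparison can be sketched numerically. The following minimal illustration (not the paper's code) uses an assumed AR(1) autocorrelation matrix in place of the estimated gray-matter V, a boxcar X, and a Gaussian low-pass S, and computes k_eff for each strategy so that E of Eq. (20) is the ratio to the prewhitening value:

```python
import numpy as np

def k_eff(x, V, S=None):
    """Variance factor for a single regressor: Var{b} = k_eff * sigma**2."""
    x = x.reshape(-1, 1)
    if S is None:                              # prewhitening (GLS, the BLUE)
        return np.linalg.inv(x.T @ np.linalg.inv(V) @ x).item()
    xs = S @ x                                 # OLS after temporal filtering by S
    P = np.linalg.inv(xs.T @ xs) @ xs.T
    return (P @ S @ V @ S.T @ P.T).item()

N, TR = 196, 3.0
t = np.arange(N) * TR
x = ((t % 60) < 30).astype(float)
x -= x.mean()                                  # boxcar with period 60 s

# Assumed AR(1) autocorrelation, standing in for the estimated V matrix.
lags = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
V = 0.4 ** lags

# Gaussian low-pass "coloring" filter, width roughly matched to the HRF.
S = np.exp(-0.5 * (lags / (2.98 / TR)) ** 2)
S /= S.sum(axis=1, keepdims=True)

k_pw = k_eff(x, V)                             # prewhitening
E_varcorr = k_pw / k_eff(x, V, S=np.eye(N))    # variance correction
E_color = k_pw / k_eff(x, V, S=S)              # coloring
```

Because prewhitening is the BLUE, both ratios are at most 1; rerunning with a randomized-ISI regressor for x reproduces the much larger gap seen in Table 1.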

Results

The results are shown in Table 1. It can be seen that even for the boxcar design, coloring is not as efficient as variance correction or prewhitening. For the randomized ISI design, the contrast is even more apparent, with prewhitening being more efficient than variance correction, which in turn is much more efficient than coloring. This concurs with the theory and the work by Friston et al. (2000).

Examination of the spectral density for the randomized ISI design before and after it has been low-pass filtered (when coloring) illustrates the reason for the loss in efficiency. These spectra are shown in Figs. 10b and 10c, respectively. It might have been expected that the frequency response in Fig. 10b would have relatively little high-frequency content due to convolution with the smooth HRF whose spectral density is shown in Fig. 6b. However, there are clearly strong high-frequency components in the regressor. These are introduced when the high temporal resolution version of the regressor is sampled to a lower temporal resolution. Figure 10c clearly shows a reduction in these high-frequency components due to the low-pass filtering, resulting in a loss of efficiency. In contrast, comparing Fig. 7b with Fig. 7c reveals little difference, particularly with regard to most of the power being in



the fundamental frequency. Also see Figs. 8 and 9. Hence for the boxcar, coloring has efficiency similar to variance correction and prewhitening.

However, this does not explain why variance correction is less efficient than prewhitening for the randomized ISI design. This loss in efficiency is instead due to using a more inefficient estimator when using variance correction, compared with the BLUE of prewhitening. Here, the prewhitening downweights the low frequencies compared to the high frequencies (inverse of Fig. 5) to give the BLUE. This is of particular benefit when the regressor has substantial power across a larger range of the frequencies being weighted, as is the case for the randomized ISI design (compare Fig. 10b with 10d).

NONLINEAR SPATIAL SMOOTHING

Given the similarity of temporal autocorrelation in the local neighborhood of each voxel, we attempt to improve the robustness of the autocorrelation estimates using a small amount of local spatial smoothing. However, our qualitative data analysis indicated clearly that the autocorrelation differs between tissue types. Isotropic spatial smoothing would blur the autocorrelation estimates across tissue boundaries, resulting in biased estimates of the autocorrelation near tissue boundaries. An alternative approach is to use some form of nonlinear spatial smoothing that does not smooth across such boundaries.

Accurate segmentation of white matter, gray matter, and CSF would provide the necessary information to avoid blurring across tissue types. However, segmentation of EPI is hard due to poor tissue-type contrast, bias field effects, low resolution, and the partial volume effect.

FIG. 8. As for Fig. 7, but X is a single-event design with fixed ISI.

FIG. 9. As for Fig. 7, but X is a single-event design with jittered ISI.

Instead we use the "smoothing over univalue segment assimilating nucleus" (SUSAN) noise-reduction filter (Smith and Brady, 1997), which is a nonlinear filter designed to preserve image structure by smoothing over only those neighbors which form part of what is believed to be the "same region," or USAN, as the central voxel under consideration. This concept is illustrated in one dimension in Fig. 11.

The filter averages over all the pixels in the locality which lie in the USAN, using the weighting

    w(r⃗, r⃗₀) = e^(−(I(r⃗) − I(r⃗₀))² / (2t²)),   (21)

where t is the brightness difference threshold, I(r⃗) is the brightness at any pixel r⃗, and w is the output of the comparison. In one dimension the full univariate equation is

    J(x) = [Σ_{i≠0} I(x + i) e^(−i²/(2σs²) − (I(x+i) − I(x))²/(2t²))] / [Σ_{i≠0} e^(−i²/(2σs²) − (I(x+i) − I(x))²/(2t²))],   (22)

where σs controls the scale of spatial smoothing, i.e., the mask size.
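A minimal 1-D sketch of Eq. (22) in NumPy (the test signal and parameter values are illustrative, not the paper's):

```python
import numpy as np

def susan_1d(I, sigma_s=2.0, t=10.0, half_width=None):
    """1-D SUSAN filter of Eq. (22): a weighted average over neighbours
    (i != 0), weighted by spatial closeness (sigma_s) and brightness
    similarity (threshold t), so dissimilar pixels barely contribute."""
    I = np.asarray(I, dtype=float)
    hw = half_width if half_width is not None else int(3 * sigma_s)
    J = np.empty_like(I)
    for x in range(len(I)):
        num = den = 0.0
        for i in range(-hw, hw + 1):
            if i == 0 or not (0 <= x + i < len(I)):
                continue
            w = np.exp(-i**2 / (2 * sigma_s**2)
                       - (I[x + i] - I[x])**2 / (2 * t**2))
            num += I[x + i] * w
            den += w
        J[x] = num / den if den > 0 else I[x]
    return J

# Illustrative test signal: a noisy step edge. SUSAN reduces the noise
# within each flat region while leaving the edge essentially intact.
rng = np.random.default_rng(0)
signal = np.r_[np.zeros(50), 100.0 * np.ones(50)] + rng.normal(0, 2, 100)
smoothed = susan_1d(signal, sigma_s=2.0, t=10.0)
```

Neighbours across the step differ by roughly 100 in brightness, so their weights e^(−100²/(2t²)) are effectively zero and the edge is preserved.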

We can use a 3D version of this filter to spatially smooth the raw autocorrelation estimate at each lag, or




to spatially smooth the spectral density estimate at each frequency. Whether we smooth the autocorrelation or the spectral density estimates will depend upon the autocorrelation/spectral density estimation technique then used on the spatially regularized data.

Normally, the USAN is estimated from the same data, or image, as that which is being smoothed. However, here we will be using one of the EPI volumes to generate the USAN.

We would expect a σs of approximately 1 or 2 voxels to be optimal, since the spatial autocorrelation of Sr suggests that smoothness is in the immediate neighborhood only. Simple histogram techniques are used to assess the approximate standard deviation of gray matter in the EPI. The brightness threshold t is then set to 1/3 of the approximate standard deviation of gray matter. This gives a very conservative brightness threshold t, one which allows a small amount of smoothing within gray matter while allowing negligible smoothing between matter types.

CALIBRATION

Methods

We now want to ascertain the difference in the bias of the resulting statistical distributions that exists for the different approaches for estimating the autocorrelation. This is determined experimentally on real rest (null) FMRI data by computing the t statistic at each voxel for a dummy design paradigm. The t statistic is given by t = cB/√(Var{cB}), where B and Var{cB} are given by Eqs. (2) and (3), respectively. The t statistics are then probability transformed to Z statistics. The probability transform involves converting the t statistic into its corresponding probability (by integrating the t distribution from the t statistic's value to infinity) and then calculating the Z statistic that corresponds to the same probability (by integrating the normal distribution from the Z statistic's value to infinity).
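The probability transform can be sketched with SciPy's distribution functions (a minimal illustration, not the paper's implementation):

```python
import numpy as np
from scipy import stats

def t_to_z(t_stat, dof):
    """Probability-transform a t statistic into a Z statistic: take the
    upper-tail probability under the t distribution with `dof` degrees of
    freedom, then find the Z with that same upper-tail probability."""
    p = stats.t.sf(t_stat, dof)   # integrate t density from t_stat to infinity
    return stats.norm.isf(p)      # Z whose upper tail has probability p
```

With heavy-tailed t (small dof), the resulting Z is smaller than the t statistic; as dof grows, the two converge.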

These Z statistics form what we refer to as the null distribution. A technique with low bias should give a null distribution that closely approximates the theoretical Z distribution (or Normal distribution).

FIG. 10. As for Fig. 7, but X is a single-event design with randomized ISI.

For the theoretical, Normal probability density function, f(Z), we can obtain the Z statistic, Z_P, for a chosen probability P such that P = ∫_{Z_P}^∞ f(Z) dZ. This can then be compared to P_null = ½ Prob(|Z| > Z_P) for the empirically obtained null distribution, d(Z). This is given by

    P_null = Σ_{|Z| > Z_P} d(Z) / (2 Σ_Z d(Z)).   (23)

Since for purposes of inference the tail is the most important part of the distribution, we examine P_null as far into the tail as the sample size will allow. This is aided by using both tails of the empirically obtained null distribution. We intend to study data taken at TR = 3 and 1.5 s.
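Equation (23) amounts to counting pooled null Z statistics beyond ±Z_P and halving; a minimal sketch, where simulated standard-normal draws stand in for the real pooled null distribution:

```python
import numpy as np

def p_null(z_pool, z_p):
    """Empirical version of Eq. (23): the fraction of pooled null Z
    statistics with |Z| > z_p, halved so both tails contribute to a
    one-tailed probability comparable with the theoretical P."""
    return np.mean(np.abs(np.asarray(z_pool)) > z_p) / 2.0

# For well-calibrated null Zs, p_null should track the theoretical
# upper-tail probability at each z_p.
rng = np.random.default_rng(1)
z = rng.standard_normal(80_000)
```

Using both tails doubles the effective sample size in the extreme tail, which is why the paper can examine probabilities as low as 10⁻⁵ from roughly 80,000 pooled voxels.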

Six different rest/null data sets (three normal volunteers, two data sets per volunteer) were obtained using TR = 3 s, and nine null data sets (three normal volunteers, three data sets per volunteer) were obtained using TR = 1.5 s. For each data set 204 EPI were acquired using a 3-T system with TE = 30 ms, in-plane resolution 4 mm, and slice thickness 7 mm. The first 8 scans were discarded to leave N = 196 scans, and the data were motion corrected, intensity normalized by subtracting the global mean time series from each voxel's time series, and nonlinear high-pass filtered. We computed an empirical distribution based on either all of the TR = 3 s data or all of the TR = 1.5 s data. The Z statistics for all of the brain voxels in the six or nine null data sets were all pooled together to give one empirical null distribution. The resulting distributions consisted of Z statistics from approximately 80,000 voxels. This allowed for examination into the tail to probabilities as low as 1 × 10⁻⁵. It is important that we examine this far into the tail of the distribution as this is approximately where inference needs to take place when multiple comparison corrections are taken into account (Worsley et al., 1992).

FIG. 11. The SUSAN smoothing method smooths only similar pixels within a local region known as the univalue segment assimilating nucleus (USAN). Here the USAN is shown as the white region within the mask.

FIG. 12. Histogram showing the orders of the AR models required to estimate the autocorrelation for all voxels in all six null data sets (TR = 3 s) after applying the randomized ISI single-event design.

FIG. 13. Comparison of Tukey autocorrelation estimation for different values of M via log probability plots comparing theoretical P against null distribution Pnull obtained from six different null data sets using TR = 3 s for (a) a boxcar design convolved with a gamma HRF and (b) a stochastic single-event design convolved with a gamma HRF. All are calculated using prewhitening. The straight dotted line shows the result for what would be a perfect match between theoretical and null distribution.

We will consider two different paradigms: the simple boxcar HRF-convolved paradigm (on the TR = 3 s data) and the single-event with randomized ISI design (on the TR = 1.5 and 3 s data), as described earlier. Various autocorrelation estimation techniques will be compared on the calibration plots when performing prewhitening.

Results

Tukey Single Taper

Recall that a suggested value for M is 2√N, which for N = 196 gives M ≈ 28. This value is compared along with a range of others (M = 5, 10, 15, 28) when using Tukey tapering with no spatial smoothing of the autocorrelation estimate. The results are shown in Figs. 13 (TR = 3 s) and 17a (TR = 1.5 s).
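The Tukey estimator tapers the raw autocorrelation to zero over M lags before it is used for prewhitening. A minimal sketch using a common form of the taper, 0.5(1 + cos(πk/M)) (details beyond this excerpt are assumptions):

```python
import numpy as np

def tukey_autocorr(y, M):
    """Raw sample autocorrelation tapered by a Tukey window of width M:
    rho(k) * 0.5 * (1 + cos(pi * k / M)) for k < M, and zero for k >= M.
    Smaller M smooths the spectral density estimate more."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    N = len(y)
    acov = np.correlate(y, y, mode='full')[N - 1:] / N   # biased estimate
    rho = acov / acov[0]
    k = np.arange(N)
    return rho * np.where(k < M, 0.5 * (1 + np.cos(np.pi * k / M)), 0.0)

rng = np.random.default_rng(2)
rho_t = tukey_autocorr(rng.standard_normal(196), M=10)
```

Truncating at a small M discards the noisy high-lag estimates, which is exactly the extra spectral smoothing the paper finds beneficial.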

The first thing to note from Fig. 13 is that when no autocorrelation estimation is made and the residuals are assumed to be white (i.e., S = I), the boxcar design deviates far more from the theoretical distribution than the single-event design. This is because the single-event design has power at frequencies across the full range and is therefore less affected by not correcting for the colored noise in the data, which is concentrated at low frequency, whereas the boxcar design's power is mostly at its fundamental frequency, which is within the range of low-frequency noise. This makes the obvious point that designs which concentrate their power as far away as possible from the low-frequency end will suffer less from the low-frequency noise in the data. However, the amount of bias for a randomized ISI when no autocorrelation estimation is made is still




quite considerable, and so there is still the requirement for the estimation and correction of the autocorrelation.

The values M = 5 and M = 10 perform about the same in Figs. 13a and 17a, and M = 5 performs slightly better than M = 10 in Fig. 13b. The key point is that lower values of M, i.e., those that smooth the spectral density more than is normally recommended, perform better.

Multitapering

Having established that lower values of M perform well when using Tukey tapering, we can use Eqs. (16) and (13) with N = 196 to give a multitaper with approximately the same spectral density estimate variance. For M = 10 this gives NW = 13. We compare this with NW = 4, a commonly used value in the literature and which corresponds to M = 32. The results are shown in Fig. 14 for the TR = 3 s data. NW = 13 performs much better and is similar to the Tukey estimator with M = 10, which is not surprising since they correspond to the same variance and therefore to similar amounts of spectral density smoothing. Again, increased spectral density smoothing compared with that which is normally recommended is better.
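Multitapering can be sketched with Slepian (DPSS) tapers; a minimal illustration assuming SciPy's `dpss` and the common choice of K = 2NW − 1 tapers (the exact settings here are not the paper's):

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(y, NW):
    """Multitaper spectral density estimate: average the periodograms of
    the data multiplied by the first K = 2*NW - 1 Slepian (DPSS) tapers."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    K = int(2 * NW - 1)
    tapers = dpss(len(y), NW, Kmax=K)        # shape (K, N) unit-energy tapers
    return (np.abs(np.fft.rfft(tapers * y, axis=1)) ** 2).mean(axis=0)

rng = np.random.default_rng(4)
psd = multitaper_psd(rng.standard_normal(196), NW=13)
```

Averaging over more tapers (larger NW) lowers the variance of the estimate at the cost of broader spectral smoothing, mirroring the role of smaller M in the Tukey case.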

Autoregressive Model and Nonparametric PAVA

When using the autoregressive model of general order, it was found to require orders of up to 6; Fig. 12 shows the histogram of the different AR orders required, pooled over all six of the null data sets taken with TR = 3 s. The results of using the autoregressive model and the nonparametric PAVA with the TR = 3 s data are shown in Fig. 15, along with Tukey with M = 10 and multitapering with NW = 13. Particularly for the boxcar design, the single-taper Tukey with M = 10 performs the best.
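A general-order AR fit can be sketched via the Yule-Walker equations, with the order chosen by AIC (the selection criterion is an assumption; the paper does not specify it in this excerpt):

```python
import numpy as np

def ar_fit_yule_walker(y, p):
    """Fit AR(p) coefficients by solving the Yule-Walker equations,
    returning (coefficients, innovation variance)."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    N = len(y)
    acov = np.correlate(y, y, mode='full')[N - 1:] / N
    if p == 0:
        return np.array([]), acov[0]
    R = np.array([[acov[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(R, acov[1:p + 1])
    return phi, acov[0] - phi @ acov[1:p + 1]

def select_ar_order(y, max_p=6):
    """Choose the AR order by minimising AIC (an assumed criterion)."""
    N = len(y)
    aics = [N * np.log(ar_fit_yule_walker(y, p)[1]) + 2 * p
            for p in range(max_p + 1)]
    return int(np.argmin(aics))

# Illustrative data from a known AR(2) process.
rng = np.random.default_rng(3)
e = rng.standard_normal(500)
y = np.zeros(500)
for n in range(2, 500):
    y[n] = 0.5 * y[n - 1] - 0.3 * y[n - 2] + e[n]
order = select_ar_order(y, max_p=6)
```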

Autocorrelation Spatial Smoothing

We have established that without any spatial regularization of the autocorrelation estimate, the single-taper Tukey with M = 5, 10 performs best. We now want to explore the additional benefits, if any, of using the SUSAN spatial smoothing of the raw autocorrelation estimate before the Tukey tapering is applied, and also to establish how much smoothing is of benefit. Spatial autocorrelation of the Sr map suggests that the autocorrelation is correlated only over a short range. The voxel dimensions for the six data sets are 4 × 4 × 7 mm, and hence we consider SUSAN filtering with σs = 4, 8, 12 mm.

Although the single-taper Tukey performed better with M = 10, because we are now regularizing spatially, it turns out to be better to allow more flexibility (i.e., less smoothing) of the spectral density by choosing a Tukey taper with M = 15. Figures 16 (TR = 3 s) and 17b (TR = 1.5 s) show the results of the different amounts of spatial smoothing. A σs of 8 mm performs best and shows improvement over performing no spatial smoothing for the TR = 3 s data, and performs similarly for the TR = 1.5 s data.

DISCUSSION

Prewhitening requires a robust estimator of the autocorrelation to maintain low bias, and Friston et al. (2000) suggest that current techniques for estimating the autocorrelation are not accurate enough to give prewhitening acceptable bias. However, in Friston et al. (2000) efforts were focused on using global estimates of autocorrelation for reasons of computational efficiency. In this paper local estimation techniques were considered and were found to perform at acceptable speeds (less than 5 min for the null data sets used in this paper).

One interesting characteristic of the calibration/bias plots in Figs. 13–16 is that the empirically obtained probabilities are predominantly less than the expected theoretical probabilities. It is not clear why this should be the case. One possibility is that all of the autocorrelation techniques are overestimating the noise at low frequency. This could be a symptom of the trade-off between being sufficiently flexible to model the low-frequency components and avoiding overfitting at higher frequencies. One solution could be to use a nonparametric model fitted in the spectral domain which allows more flexibility at low frequencies (Marchini and Ripley, 2000). In this paper nonlinear spatial smoothing is used to regularize spatially, and this allows the use of a more flexible M = 15 Tukey window, while at the same time avoiding overfitting. This reduces bias to close to zero.

In similar work by Burock and Dale (2000), a first-order autoregressive model with an extra white noise component is used when performing prewhitening on randomized ISI designs. They also demonstrate the efficiency gained through prewhitening and show that their estimates are unbiased. However, they do not appear to examine the bias as far into the tail as in this paper. Perhaps more importantly, the plots used to examine the bias are on a linear scale, which makes bias at low probabilities very difficult to assess. In this paper we use a log–log scale, and our findings suggest that bias is evident in the tail for general-order AR models.

It would also be interesting to know more about the source of temporal autocorrelation, particularly with regard to its spatial nature. In particular, it would be interesting to understand the source of the increased autocorrelation in the gray matter, whether it is physiological in origin, and how it varies within the gray matter itself.

CONCLUSIONS

As in Friston et al. (2000), we have demonstrated that when using designs such as a boxcar convolved with a gamma HRF, coloring can be used with minimal loss of efficiency. However, for single-event designs with randomized ISIs, jittering, or just very short ISIs, coloring is much less efficient and hence prewhitening is desirable.

FIG. 14. Comparison of multitapering autocorrelation estimation for different values of NW via log probability plots comparing theoretical P against null distribution Pnull obtained from six different null data sets using TR = 3 s for (a) a boxcar design convolved with a gamma HRF and (b) a stochastic single-event design convolved with a gamma HRF. All are calculated using prewhitening. The straight dotted line shows the result for what would be a perfect match between theoretical and null distribution.

FIG. 15. Comparison of the different autocorrelation estimation techniques via log probability plots comparing theoretical P against null distribution Pnull obtained from six different null data sets using TR = 3 s for (a) a boxcar design convolved with a gamma HRF and (b) a stochastic single-event design convolved with a gamma HRF. All are calculated using prewhitening. The straight dotted line shows the result for what would be a perfect match between theoretical and null distribution.

Prewhitening requires a robust estimator of the autocorrelation to maintain low bias. To estimate the autocorrelation, or equivalently the spectral density, for use in prewhitening, different techniques were considered. These were single-tapering Tukey, multitapering, an autoregressive model of general order, and a nonparametric approach that assumes monotonicity in the autocorrelation.



Crucially, nonlinear high-pass filtering is performed as a preprocessing step to remove the worst of the nonstationary components and low-frequency noise. A Tukey taper, with much greater smoothing of the spectral density than is normally recommended in the literature, performed the best when prewhitening.

FIG. 16. Comparison of different amounts of spatial smoothing of the raw autocorrelation estimate prior to using Tukey tapering with M = 15, via log probability plots comparing theoretical P against null distribution Pnull obtained from six different null data sets using TR = 3 s for (a) a boxcar design convolved with a gamma HRF and (b) a stochastic single-event design convolved with a gamma HRF. All are calculated using prewhitening. The straight dotted line shows the result for what would be a perfect match between theoretical and null distribution. MS is the mask size used in the SUSAN smoothing and corresponds to σs.

FIG. 17. (a) Comparison of Tukey autocorrelation estimation for different values of M and (b) comparison of different amounts of spatial smoothing of the raw autocorrelation estimate prior to using Tukey tapering with M = 15, for data taken with TR = 1.5 s. The comparison is made via log probability plots comparing theoretical P against null distribution Pnull obtained from nine different null data sets with a stochastic single-event design convolved with a gamma HRF. All are calculated using prewhitening. The straight dotted line shows the result for what would be a perfect match between theoretical and null distribution. MS is the mask size used in the SUSAN smoothing and corresponds to σs.

Importantly, a small amount of spatial smoothing of the autocorrelation estimates was also found to be necessary to reduce bias to close to zero at low probability levels. The autocorrelation was found to vary considerably between matter types, with higher positive autocorrelation (low-frequency noise) in the gray matter compared with the white matter. Therefore, nonlinear spatial smoothing of the autocorrelation was used, which smoothed only within matter types. Using a Tukey taper (M = 15) along with the nonlinear spatial




smoothing, we were able to reduce bias to close to zero at probability levels as low as 1 × 10⁻⁵.

ACKNOWLEDGMENTS

The authors thank Dr. Mark Jenkinson (FMRIB Centre, University of Oxford) and Jonathan Marchini (Statistics Department, University of Oxford) for their help and input into this work, and acknowledge support from the UK MRC and EPSRC.

REFERENCES

Aguirre, G., Zarahn, E., and D'Esposito, M. 1997. Empirical analyses of BOLD fMRI statistics. II. Spatially smoothed data collected under null-hypothesis and experimental conditions. NeuroImage 5: 199–212.

Bandettini, P., and Cox, W. 2000. Event-related fMRI contrast when using constant interstimulus interval: Theory and experiment. Magn. Reson. Med. 43: 540–548.

Bracewell, R. N. 1978. The Fourier Transform and Its Applications, 2nd ed. McGraw–Hill Kogakusha, New York.

Bullmore, E., Brammer, M., Williams, S., Rabe-Hesketh, S., Janot, N., David, A., Mellers, J., Howard, R., and Sham, P. 1996. Statistical methods of estimation and inference for functional MR image analysis. Magn. Reson. Med. 35: 261–277.

Burock, M. A., Buckner, R. L., Woldorff, M. G., Rosen, B. R., and Dale, A. M. 1998. Randomized event-related experimental designs allow for extremely rapid presentation rates using functional MRI. NeuroReport 9: 3735–3739.

Burock, M. A., and Dale, A. M. 2000. Estimation and detection of event-related fMRI signals with temporally correlated noise: A statistically efficient and unbiased approach. Hum. Brain Mapp. 11: 249–260.

Chatfield, C. 1996. The Analysis of Time Series. Chapman & Hall, London/New York.

Dale, A. 1999. Optimal experimental design for event-related fMRI. Hum. Brain Mapp. 8: 109–114.

Dale, A., and Buckner, R. 1997. Selective averaging of rapidly presented individual trials using fMRI. Hum. Brain Mapp. 5: 329–340.

Friston, K., Holmes, A., Poline, J.-B., Grasby, P., Williams, S., Frackowiak, R., and Turner, R. 1995. Analysis of fMRI time series revisited. NeuroImage 2: 45–53.

Friston, K., Josephs, O., Zarahn, E., Holmes, A., Rouquette, S., and Poline, J.-B. 2000. To smooth or not to smooth? NeuroImage 12: 196–208.

Hu, X., Le, T., Parrish, T., and Erhard, P. 1995. Retrospective estimation and correction of physiological fluctuation in functional MRI. Magn. Reson. Med. 34: 201–212.

Josephs, O., Turner, R., and Friston, K. 1997. Event-related fMRI. Hum. Brain Mapp. 5: 1–7.

Locascio, J., Jennings, P., Moore, C., and Corkin, S. 1997. Time series analysis in the time domain and resampling methods for studies of functional magnetic resonance brain imaging. Hum. Brain Mapp. 5: 168–193.

Marchini, J., and Ripley, B. 2000. A new statistical approach to detecting significant activation in functional MRI. NeuroImage 12: 366–380.

Percival, D. B., and Walden, A. T. 1993. Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques. Cambridge Univ. Press, Cambridge, UK.

Purdon, P., and Weisskoff, R. 1998. Effect of temporal autocorrelation due to physiological noise and stimulus paradigm on voxel-level false-positive rates in fMRI. Hum. Brain Mapp. 6: 239–249.

Seber, G. 1977. Linear Regression Analysis. Wiley, New York.

Smith, S., and Brady, J. 1997. SUSAN—A new approach to low level image processing. Int. J. Comput. Vision 23: 45–78.

Robertson, T., Wright, F. T., and Dykstra, R. 1988. Order Restricted Statistical Inference. Wiley, New York.

Woods, R., Mazziotta, J., and Cherry, S. 1993. MRI–PET registration with automated algorithm. J. Comput. Assisted Tomogr. 17: 536–546.

Worsley, K., Evans, A., Marrett, S., and Neelin, P. 1992. A three-dimensional statistical analysis for CBF activation studies in human brain. J. Cereb. Blood Flow Metab. 12: 900–918.

Worsley, K., and Friston, K. 1995. Analysis of fMRI time series revisited—Again. NeuroImage 2: 173–181.

Zarahn, E., Aguirre, G., and D'Esposito, M. 1997. Empirical analyses of BOLD fMRI statistics. I. Spatially unsmoothed data collected under null-hypothesis conditions. NeuroImage 5: 179–197.