Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior

Peihua Qiu¹ and Dongdong Xiang²

¹School of Statistics, University of Minnesota
²School of Finance and Statistics, East China Normal University

Abstract

In our daily life, we often need to identify individuals whose longitudinal behavior is different from that of well-functioning individuals, so that unpleasant consequences can be avoided. In many such applications, observations of a given individual are obtained sequentially, and it is desirable to have a screening system that gives a signal of irregular behavior as soon as possible after that individual's longitudinal behavior starts to deviate from the regular behavior, so that adjustments or interventions can be made in a timely manner. This paper proposes a dynamic screening system for that purpose in cases when the longitudinal behavior is univariate, using statistical process control and longitudinal data analysis techniques. Several different cases, including those with regularly spaced observation times, irregularly spaced observation times, and correlated observations, are discussed. Our proposed method is demonstrated using a real-data example from the SHARe Framingham Heart Study of the National Heart, Lung and Blood Institute. This paper has supplementary materials online.

Key Words: Correlation; Dynamic screening; Longitudinal data; Process monitoring; Process screening; Standardization; Statistical process control; Unequal sampling intervals.

    1 Introduction

Most products (e.g., airplanes, cars, batteries of notebook computers) need to be checked regularly or occasionally with respect to certain variables related to their quality and/or performance. If the observed values of these variables for a given product are significantly worse than the values of a typical well-functioning product of the same age, then proper adjustments or interventions should be made to avoid unpleasant consequences. This paper suggests a dynamic screening system for identifying such improperly functioning products effectively.

The motivating example of this research is the SHARe Framingham Heart Study of the National Heart, Lung and Blood Institute. In the study, 1055 patients were recruited, each patient was followed 7 times, and the total cholesterol level of each patient was recorded during each of the 7 clinic visits. Among the 1055 patients, 1028 did not have any strokes during the study, and the remaining 27 patients had at least one stroke. One major research goal of the study is to identify patients with irregular total cholesterol level patterns as early as possible, so that medical treatments can be given in a timely manner to avoid the occurrence of stroke. To this end, one traditional method is to construct pointwise confidence intervals for the mean total cholesterol level of the non-stroke patient population across different ages, using the longitudinal data of all non-stroke patients. Then, a patient can be identified as having an irregular total cholesterol level pattern if his/her total cholesterol levels go beyond the confidence intervals. In the literature, there have been some discussions about the construction of such confidence intervals. See, for instance, Ma et al. (2012), Yao et al. (2005), and Zhao and Wu (2008).

The confidence interval methods mentioned above use a cross-sectional comparison approach. In the context of the total cholesterol level example, they compare patients' total cholesterol levels at the cross-section of a given age. One limitation of this approach is that it does not make use of the medical history of a given patient in the medical diagnostics. If the total cholesterol levels at different ages of a given patient are regarded as observations of a random process, then intuitively we should make use of all such observations of the process for medical diagnostics. If a patient's total cholesterol level is consistently above the mean total cholesterol level of the non-stroke patient population for a long enough time period, then that patient should still be identified as a person who has an irregular total cholesterol level pattern, even if his/her total cholesterol level at any given time point is within the confidence interval of the mean total cholesterol level of the non-stroke patient population. Further, it is critical that the diagnostic method be dynamic, in the sense that a decision about a patient can be made at any time during the observation process, as long as all observations up to the current time point have provided enough evidence for the decision. That is because the major purpose of monitoring a patient's total cholesterol level is to find any irregular pattern as soon as possible, so that necessary medical treatments can be given in a timely manner. The confidence interval methods mentioned in the previous paragraph usually do not have such a dynamic decision-making property, because they do not follow a patient sequentially over time. In the literature, some statistical process control (SPC) methodologies, including the cumulative sum (CUSUM) and the exponentially weighted moving average (EWMA) charts, are designed for such purposes (e.g., Hawkins and Olwell 1998, Montgomery 2009, Qiu 2013). These control charts evaluate the performance of a process sequentially based on all observed data up to the current time point. However, they cannot be applied to the current problem directly, for the following reasons. First, most SPC methods are for monitoring a single process (e.g., a production line, a medical system). When the process performs satisfactorily, its observations follow a so-called in-control (IC) distribution and the process is said to be IC. The major purpose of SPC control charts is to detect any distributional shift in the process observations as soon as possible, so that the related process can be stopped promptly for adjustments and repairs. In the current problem, if each patient is regarded as a process, then there are many processes involved. To judge whether a patient's total cholesterol level is IC, we need to take the total cholesterol levels of all non-stroke patients into consideration. A conventional SPC control chart is not designed for that purpose. Second, in a typical SPC problem, if we are interested in monitoring the process mean, then the process mean is assumed to be a constant when the process is IC. In the current problem, however, the mean total cholesterol level of the non-stroke patients changes over time. Because of these fundamental differences between the conventional SPC problem and the current problem, the existing SPC methodologies cannot be applied directly to the current problem.

In this paper, we focus on cases when the variable to monitor (e.g., the total cholesterol level) is univariate and we are interested in monitoring its mean. In such cases, we propose a dynamic screening system (DySS) to dynamically identify individuals with irregular longitudinal patterns. In the DySS method, the regular longitudinal pattern is first estimated from the observed longitudinal data of a group of well-functioning individuals (e.g., the non-stroke patients in the stroke example). Then, the estimated regular longitudinal pattern is used for standardizing the observations of an individual to monitor. Finally, an SPC control chart is applied to the standardized observations of that individual for online monitoring of its longitudinal behavior.

It should be pointed out that the DySS problem described above looks like a profile monitoring problem in SPC. But the two problems are actually completely different. In the profile monitoring problem, observations obtained from a sampled product form a profile that describes the relationship between a response variable and one or more explanatory variables, the profiles from different sampled products are usually ordered by the observation times or spatial locations of the sampled products, and the major goal is to detect any shift in the mean and/or variance of the profiles over time or location (cf., e.g., Ding et al. 2006, Jensen and Birch 2009, Qiu et al. 2010). As a comparison, in the DySS problem, although the observations of each individual look like a profile, each individual is actually treated as a separate process, and we are mainly interested in monitoring the longitudinal behavior of each individual. Different individuals are often not ordered by time or location.

The remainder of the article is organized as follows. Our proposed DySS methodology is described in detail in Section 2. A simulation study evaluating its numerical performance is presented in Section 3. The proposed method is demonstrated using the motivating example about patients' total cholesterol levels in Section 4. Some concluding remarks are given in Section 5. Theoretical properties of the estimated IC mean function of the longitudinal response are discussed in a supplementary file, along with certain technical details and numerical results.

    2 The DySS Method

Our proposed DySS method is described in four parts. In the first part, estimation of the regular longitudinal pattern from the observed longitudinal data of a group of well-functioning individuals is discussed. In the second part, an SPC chart is proposed for monitoring individuals dynamically, after their observations are standardized by the estimated regular longitudinal pattern; in this part, longitudinal observations are assumed to be independent and normally distributed. Situations when they are normally distributed and autocorrelated, with the autocorrelation described by an AR(1) time series model, are discussed in the third part. Finally, general cases when observations are autocorrelated with an arbitrary autocorrelation and when their distribution is non-normal are discussed in the fourth part.

    2.1 Estimation of the regular longitudinal pattern

Assume that y is the variable that we are interested in monitoring, and that observations of y for a group of m well-functioning individuals follow the model

y(t_ij) = µ(t_ij) + ε(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J,    (1)

where t_ij is the jth observation time of the ith individual, y(t_ij) is the observed value of y at t_ij, µ(t_ij) is the mean of y(t_ij), and ε(t_ij) is the error term. For simplicity, we assume that the design space is [0, 1]. As in the literature of longitudinal data analysis (e.g., Li 2011), we further assume that the error term ε(t), for any t ∈ [0, 1], consists of two independent components, i.e., ε(t) = ε₀(t) + ε₁(t), where ε₀(·) is a random process with mean 0 and covariance function V₀(s, t), for any s, t ∈ [0, 1], and ε₁(·) is a noise process such that ε₁(s) and ε₁(t) are independent for any s ≠ t in [0, 1]. In this decomposition, ε₁(t) denotes the pure measurement error, and ε₀(t) accounts for all possible covariates that may affect y but are not included in model (1). In such cases, the covariance function of ε(·) is

V(s, t) = Cov(ε(s), ε(t)) = V₀(s, t) + σ₁²(s) I(s = t), for any s, t ∈ [0, 1],    (2)

where σ₁²(s) = Var(ε₁(s)), and I(s = t) = 1 when s = t and 0 otherwise. In model (1), observations of different individuals are assumed to be independent. As a side note, model (1) is similar to the nonparametric mixed-effects model discussed in Qiu et al. (2010). Their major difference is that the variance of the pure measurement error ε₁(t) in (1) is allowed to vary over t, while that variance is assumed to be a constant in Qiu et al. (2010). Also, J is assumed to be a constant in (1) for simplicity. The method discussed in this paper can be generalized to cases when J depends on i without much difficulty.

For the longitudinal model specified by (1) and (2), Chen et al. (2005) and Pan et al. (2009) propose the following estimator of µ(t), based on the local pth-order polynomial kernel smoothing procedure:

µ̂(t; V) = e₁′ ( Σ_{i=1}^m X_i′ V_i⁻¹ X_i )⁻¹ ( Σ_{i=1}^m X_i′ V_i⁻¹ y_i ),    (3)

where e₁ is the (p+1)-dimensional vector with 1 in its first component and 0 everywhere else, X_i is the J × (p+1) matrix whose jth row is (1, (t_ij − t), . . . , (t_ij − t)^p), V_i⁻¹ = K_ih^{1/2}(t) (I_i Σ_i I_i)⁻¹ K_ih^{1/2}(t), K_ih(t) = diag{ K((t_i1 − t)/h), . . . , K((t_iJ − t)/h) } / h, K is a density kernel function, h > 0 is a bandwidth, I_i = diag{ I(|t_i1 − t| ≤ 1), . . . , I(|t_iJ − t| ≤ 1) }, and Σ_i is the covariance matrix of y_i = (y(t_i1), y(t_i2), . . . , y(t_iJ))′, which can be computed from V(s, t) in (2). Throughout this paper, the inverse of a matrix refers to the Moore-Penrose generalized inverse, which always exists.
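To make the estimator (3) concrete, below is a minimal Python sketch of its working-independence special case (all Σ_i set to the identity), which is exactly the initial estimator µ̃(t) introduced in the next paragraph. The function names and the use of NumPy are our own illustration, not code from the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov density kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def mu_hat(t, times, ys, h, p=1):
    """Local p-th order polynomial kernel estimator of mu(t) under working
    independence (all Sigma_i equal to the identity in (3)).

    times, ys : (m, J) arrays of observation times t_ij and values y(t_ij).
    """
    d = times.ravel() - t                        # t_ij - t for all i, j
    w = epanechnikov(d / h) / h                  # kernel weights K_h(t_ij - t)
    X = np.vander(d, N=p + 1, increasing=True)   # rows (1, (t_ij-t), ..., (t_ij-t)^p)
    A = X.T @ (w[:, None] * X)                   # X' W X
    b = X.T @ (w * ys.ravel())                   # X' W y
    beta = np.linalg.pinv(A) @ b                 # Moore-Penrose inverse, as in the paper
    return beta[0]                               # e_1' beta = mu-hat(t)
```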

In practice, the error covariance function V(s, t) is usually unknown, and it needs to be estimated from the observed data. To this end, we propose a method consisting of several steps, described below. First, we compute an initial estimator of µ(t), denoted as µ̃(t), by (3) with all Σ_i replaced by the identity matrix. Obviously, µ̃(t) is just the local pth-order polynomial kernel estimator based on the assumption that the error terms in (1) at different time points are i.i.d. for each individual subject. Second, we define the residuals

ε̃(t_ij) = y(t_ij) − µ̃(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

Third, when s ≠ t, we use the method originally proposed by Li (2011) to estimate V₀(s, t) in (2) by

Ṽ₀(s, t) = ( A₁(s, t)V₀₀(s, t) − A₂(s, t)V₁₀(s, t) − A₃(s, t)V₀₁(s, t) ) B⁻¹(s, t),    (4)

where A₁(s, t) = S₂₀(s, t)S₀₂(s, t) − S₁₁²(s, t), A₂(s, t) = S₁₀(s, t)S₀₂(s, t) − S₀₁(s, t)S₁₁(s, t), A₃(s, t) = S₀₁(s, t)S₂₀(s, t) − S₁₀(s, t)S₁₁(s, t), B(s, t) = A₁(s, t)S₀₀(s, t) − A₂(s, t)S₁₀(s, t) − A₃(s, t)S₀₁(s, t),

S_{l₁l₂}(s, t) = [1/(mJ(J−1))] Σ_{i=1}^m Σ_{j≠j′} ((t_ij − s)/h)^{l₁} ((t_ij′ − t)/h)^{l₂} K_h(t_ij − s) K_h(t_ij′ − t),

V_{l₁l₂}(s, t) = [1/(mJ(J−1))] Σ_{i=1}^m Σ_{j≠j′} ε̃(t_ij) ε̃(t_ij′) ((t_ij − s)/h)^{l₁} ((t_ij′ − t)/h)^{l₂} K_h(t_ij − s) K_h(t_ij′ − t),

K_h(t_ij − s) = K((t_ij − s)/h)/h, and l₁, l₂ = 0, 1, 2. Note that Ṽ₀(s, t) in (4) may not be positive semidefinite. To address this issue, the adjustment procedure (15) in Li (2011), which is described in the supplementary file, can effectively regularize it into a well-defined covariance function. Then, the variance of y(t), denoted as σ²_y(t) = V₀(t, t) + σ₁²(t), can be regarded as the mean function of ε²(t), and it can be estimated from the new data {ε̃²(t_ij), i = 1, 2, . . . , m, j = 1, 2, . . . , J} by (3), in which y_i is replaced by (ε̃²(t_i1), . . . , ε̃²(t_iJ))′ and Σ_i is replaced by the identity matrix. The resulting estimator is denoted as σ̃²_y(t). Finally, we define the estimator of V(s, t) by

Ṽ(s, t) = Ṽ₀(s, t) I(s ≠ t) + σ̃²_y(t) I(s = t).    (5)

Consequently, the mean function µ(t) can be estimated by µ̂(t; Ṽ).

Note that, in the four steps described above for computing µ̃(t), Ṽ₀(s, t), σ̃²_y(t), and µ̂(t; Ṽ), the bandwidth h could be chosen differently in each step. We did not distinguish them in the above description for simplicity. In all our numerical examples presented later, they are allowed to be different, and each of them is chosen separately and sequentially by the conventional cross-validation (CV) procedure (cf. the description in the supplementary file) as follows. First, the bandwidth for µ̃(t) is selected. Then, the bandwidths for Ṽ₀(s, t) and σ̃²_y(t) are selected, respectively. Finally, the bandwidth for µ̂(t; Ṽ) is selected. Some theoretical properties of µ̂(t; Ṽ) are given in a proposition in the supplementary file, which shows that µ̂(t; Ṽ) is statistically consistent under some regularity conditions.
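The CV procedure itself is described in the supplementary file and is not reproduced here. Purely as an illustration of the idea, the following sketch scores candidate bandwidths for µ̃(t) by leave-one-individual-out prediction error, reusing mu_hat from the sketch above; the leave-one-individual-out variant is our assumption, not necessarily the paper's exact procedure.

```python
import numpy as np

def cv_bandwidth(times, ys, candidates):
    """Illustrative leave-one-individual-out CV for the bandwidth of mu-tilde."""
    m = times.shape[0]
    best_h, best_score = None, np.inf
    for h in candidates:
        score = 0.0
        for i in range(m):
            keep = np.arange(m) != i                  # hold out individual i
            for t, y in zip(times[i], ys[i]):
                score += (y - mu_hat(t, times[keep], ys[keep], h)) ** 2
        if score < best_score:
            best_h, best_score = h, score
    return best_h
```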

By the estimation procedure described above, we can compute the local polynomial kernel estimator µ̂(t; Ṽ) of the mean function µ(t) in cases when the error covariance function V(s, t) is unknown. After µ(t) is estimated, the variance function σ²_y(t) can be estimated in the same way, except that the original observations {y(t_ij)} are replaced by

ε̂²(t_ij) = (y(t_ij) − µ̂(t_ij; Ṽ))², for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

The resulting estimator of σ²_y(t) is denoted as σ̂²_y(t).
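A matching sketch of this variance-estimation step, reusing mu_hat as the smoother; note that it uses the working-independence µ̃(t) in place of µ̂(t; Ṽ), so it is a simplification of the procedure in the text.

```python
import numpy as np

def sigma2_hat(t, times, ys, h_mu, h_var):
    """Estimate sigma_y^2(t) by kernel-smoothing squared residuals."""
    resid2 = np.array([[(y - mu_hat(tt, times, ys, h_mu)) ** 2
                        for tt, y in zip(trow, yrow)]
                       for trow, yrow in zip(times, ys)])
    return mu_hat(t, times, resid2, h_var)    # smooth the squared residuals
```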

The models (1) and (2) are quite general, and they include a variety of different situations as special cases. In certain applications, the variance structure of the observed data can be described by a simpler model. For instance, in the SPC literature, we often consider cases when the error terms at different time points are independent of each other or follow a specific time series model. In the current setup, the first scenario corresponds to the case when V₀(s, t) = 0 for any s, t ∈ [0, 1]. In such cases, µ(t) and σ²_y(t) can be estimated in the same way as described above, except that we do not need to estimate V₀(s, t).

    2.2 SPC charts for dynamically identifying irregular individuals

In the previous part, we defined the estimators µ̂(t; Ṽ) and σ̂²_y(t) of the IC mean function µ(t) and the IC variance function σ²_y(t), based on the observed longitudinal data of m well-functioning individuals, which are sometimes called IC data for simplicity. In this part, we propose SPC charts for sequentially monitoring the longitudinal behavior of a new individual, using the estimated regular pattern of the longitudinal response variable y(t) that is described by µ̂(t; Ṽ) and σ̂²_y(t).

Assume that the new individual's y values are observed at times t*_1, t*_2, . . . over the design interval [0, 1]. In cases when that individual's longitudinal behavior is IC, these y values should also follow the model (1). For convenience of discussion, we re-write that model as

y(t*_j) = µ(t*_j) + σ_y(t*_j) ǫ(t*_j), for j = 1, 2, . . . ,    (6)

where σ_y(t)ǫ(t) here equals ε(t) in model (1). Thus, ǫ(t) in model (6) has mean 0 and variance 1 at each t. For the y values of the new individual, let us define their standardized values by

ǫ̂(t*_j) = ( y(t*_j) − µ̂(t*_j; Ṽ) ) / σ̂_y(t*_j), for j = 1, 2, . . .    (7)
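Combining the sketches above, the standardization (7) for a single new observation might look as follows, again with the µ̃-based estimators standing in for µ̂(t; Ṽ) and σ̂²_y(t):

```python
def standardize(t_star, y_star, times_ic, ys_ic, h_mu, h_var):
    """Standardized value eps-hat(t*_j) in (7) for one new observation,
    built on the working-independence sketches above."""
    mu = mu_hat(t_star, times_ic, ys_ic, h_mu)
    s2 = sigma2_hat(t_star, times_ic, ys_ic, h_mu, h_var)
    return (y_star - mu) / s2 ** 0.5
```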

In the case when {y(t*_j), j = 1, 2, . . .} are independent of each other and each of them has a normal distribution, {ǫ̂(t*_j), j = 1, 2, . . .} would be a sequence of asymptotically i.i.d. random variables with a common asymptotic distribution of N(0, 1). If we further assume that {t*_j, j = 1, 2, . . .} are equally spaced, then {ǫ̂(t*_j), j = 1, 2, . . .} can be monitored by a conventional control chart, such as a CUSUM or EWMA chart, as usual. For instance, to detect upward mean shifts in {ǫ̂(t*_j), j = 1, 2, . . .}, the charting statistic of the upward CUSUM chart can be defined by

C⁺_j = max( 0, C⁺_{j−1} + ǫ̂(t*_j) − k ), for j ≥ 1,    (8)

where C⁺₀ = 0 and k > 0 is the allowance constant. The chart gives a signal of an upward mean shift if

C⁺_j > ρ,    (9)

where ρ > 0 is a control limit. To evaluate the performance of the CUSUM chart (8)-(9), we can use the IC average run length (ARL), denoted as ARL₀, which is the average number of time points from the beginning of process monitoring to the signal time under the condition that the process is IC, and the out-of-control (OC) ARL, denoted as ARL₁, which is the average number of time points from the occurrence of a shift to the signal time under the condition that the process is OC. Usually, the allowance k is specified beforehand. It has been well demonstrated in the literature that large k values are good for detecting large shifts and small k values are good for detecting small shifts. If the shift size δ in the mean of ǫ̂ is known, then the optimal k is δ/2 (cf. Moustakides 1986). The control limit ρ is chosen such that a pre-specified ARL₀ value is reached. Then, one chart performs better than another for detecting a given shift if its ARL₁ value is smaller. For some commonly used k and ARL₀ values, the corresponding ρ values can be found in Table 3.2 of Hawkins and Olwell (1998). They can also be computed easily by software packages such as the R package spc (Knoth 2011).
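For illustration, here is a minimal Python sketch of the chart (8)-(9) applied to a stream of standardized observations; the function name and interface are our own.

```python
def upward_cusum(eps_hat, k, rho):
    """Upward CUSUM chart (8)-(9): returns the index of the first signal
    (0-based), or None if the chart never signals.

    eps_hat : sequence of standardized observations from (7).
    k, rho  : allowance constant and control limit.
    """
    c = 0.0                                  # C_0^+ = 0
    for j, e in enumerate(eps_hat):
        c = max(0.0, c + e - k)              # charting statistic (8)
        if c > rho:                          # decision rule (9)
            return j
    return None
```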

In practice, however, the observation times {t*_j, j = 1, 2, . . .} may not be equally spaced, as in the total cholesterol level example discussed in Section 1. In such cases, ARL₀ and ARL₁ are obviously inappropriate for measuring the performance of the CUSUM chart (8)-(9), and we need to define new performance measures. To this end, let ω > 0 be the basic time unit in a given application, defined as the largest time unit such that all unequally spaced observation times are integer multiples of it. Define

n*_j = t*_j / ω, for j = 0, 1, 2, . . . ,

where n*₀ = t*₀ = 0. Then t*_j = n*_j ω for all j, and n*_j is the jth observation time expressed in basic time units. In cases when the new individual is IC and the CUSUM chart (8)-(9) gives a signal at the sth observation time, n*_s is a random variable measuring the time to a false signal. Its mean, E(n*_s), measures the IC average time to signal (ATS), denoted as ATS₀. If the new individual has an upward mean shift starting at the τth observation time and the CUSUM chart (8)-(9) gives a signal at the sth time point with s ≥ τ, then the mean of n*_s − n*_τ is called the OC ATS, denoted as ATS₁. To measure the performance of a control chart (e.g., the chart (8)-(9)), its ATS₀ value can be fixed at a certain level beforehand; the chart then performs better if its ATS₁ value is smaller when detecting a shift of a given size. Obviously, the values of ATS₀ and ATS₁ are just constant multiples of the corresponding values of ARL₀ and ARL₁ in cases when the observation times are equally spaced; in such cases, the two sets of measures are equivalent. Also, when the observation times are unequally spaced, it is possible to use the IC average number of observations to signal (ANOS), denoted as ANOS₀, and the OC average number of observations to signal, denoted as ANOS₁, for measuring the performance of control charts. However, these quantities are also just constant multiples of ATS₀ and ATS₁ in cases when the sampling distribution does not change over time. Furthermore, when the sampling distribution changes over time, ANOS₀ and ANOS₁ may not measure the chart performance well. As a demonstration, consider two scenarios. In the first scenario, observations are collected at the basic time units 1, 3, 7, . . . , and a chart gives a signal at the third observation time. In the second scenario, observations are collected at the basic time units 1, 30, 100, . . . , and another chart also gives a signal at the third observation time. Assume that the processes are IC in both scenarios. Then the IC performance of the two charts is the same if ANOS₀ is used as the performance measure. But the second chart does not give a false signal until the 100th basic time unit, while the first chart gives a false signal at the 7th basic time unit. Obviously, ANOS₀ cannot reflect this quite dramatic difference. For all these reasons, only ATS₀ and ATS₁ are used in the remainder of the paper for measuring the performance of a control chart.

If the IC mean function µ(t) and the IC variance function σ²_y(t) are known, then the standardized observations of the new individual can be defined by

ǫ(t*_j) = ( y(t*_j) − µ(t*_j) ) / σ_y(t*_j), for j = 1, 2, . . .    (10)

In such cases, as long as the distribution of the observation times {t*_j, j = 1, 2, . . .} is specified properly, for a given k value in (8) and a given ATS₀ value, we can easily compute the corresponding value of the control limit ρ by simulation such that the pre-specified ATS₀ value is reached. For instance, when the distribution of the observation times is specified by the sampling rate d, defined here to be the number of observation times every 10 basic time units, the computed ρ values in cases when ATS₀ = 20, 50, 100, 150, 200, k = 0.1, 0.2, 0.5, 0.75, 1.0, and d = 2, 5, 10 are presented in Table 1. From the table, it can be seen that ρ increases with ATS₀ and d, and decreases with k. In cases when µ(t) and σ²_y(t) are unknown, they need to be estimated from an IC dataset by µ̂(t; Ṽ) and σ̂²_y(t), as discussed in Section 2.1. In such cases, the standardized observations of the new individual defined by (7) are only asymptotically N(0, 1) distributed when the new individual is IC. When the IC sample size is moderate to large, we can still use the control limit ρ computed in the case of (10), which will be justified by the numerical results described in the next section.

Table 1: Control limits of the CUSUM chart (8)-(9) for known IC mean and variance functions and various combinations of ATS₀, k, and d described in the text. For the entry marked with "∗", the actual ATS₀ value is more than 1% away from the assumed ATS₀ value. For all other entries, the actual ATS₀ values are within 1% of the assumed ATS₀ values.

 d    k      ATS₀=20   ATS₀=50   ATS₀=100   ATS₀=150   ATS₀=200
 2    0.10    0.929     1.844     2.820      3.552      4.171
      0.20    0.774     1.571     2.351      2.928      3.396
      0.50    0.382     0.986     1.493      1.822      2.088
      0.75    0.118     0.654     1.064      1.312      1.507
      1.00    0.001∗    0.355     0.718      0.936      1.101
 5    0.10    1.844     3.215     4.612      5.631      6.445
      0.20    1.571     2.664     3.713      4.381      5.005
      0.50    0.986     1.685     2.234      2.586      2.898
      0.75    0.654     1.200     1.612      1.869      2.072
      1.00    0.355     0.837     1.189      1.404      1.562
10    0.10    2.820     4.612     6.407      7.649      8.612
      0.20    2.351     3.713     4.937      5.744      6.408
      0.50    1.493     2.235     2.852      3.245      3.542
      0.75    1.061     1.610     2.039      2.310      2.509
      1.00    0.718     1.192     1.538      1.738      1.894
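To illustrate how such control limits can be obtained, the following sketch pairs a Monte Carlo estimate of ATS₀ (assuming i.i.d. N(0, 1) standardized observations arriving every 10/d basic time units) with a bisection search over ρ; the replication counts and search bounds are arbitrary illustrative choices, not the paper's algorithm.

```python
import numpy as np

def simulate_ats0(rho, k, d, n_reps=2000, horizon=10000, rng=None):
    """Monte Carlo estimate of ATS0 (in basic time units) of chart (8)-(9)
    for i.i.d. N(0, 1) observations arriving every 10/d basic time units."""
    rng = rng if rng is not None else np.random.default_rng(0)
    gap = 10.0 / d                          # basic time units between observations
    total = 0.0
    for _ in range(n_reps):
        c, j = 0.0, 0
        while True:
            j += 1
            c = max(0.0, c + rng.standard_normal() - k)
            if c > rho or j * gap >= horizon:
                break
        total += j * gap
    return total / n_reps

def calibrate_rho(k, d, ats0_target, lo=0.0, hi=10.0, tol=0.005):
    """Bisection search for the control limit rho reaching ats0_target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulate_ats0(mid, k, d) < ats0_target:
            lo = mid                        # chart signals too early: raise rho
        else:
            hi = mid
    return 0.5 * (lo + hi)
```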

In the SPC literature, the concept of ATS has been proposed for measuring the performance of a control chart with variable sampling intervals (VSIs). See, for instance, Reynolds et al. (1990). However, the VSI problem in SPC is completely different from the DySS problem discussed here. In the VSI problem, the next observation time is determined by the current and all past observations of the process; the next observation is collected sooner if there is more evidence of a shift based on all available process observations, and later otherwise. Therefore, the VSI scheme is an integrated part of process monitoring. As a comparison, in the DySS problem, observation times are often not specifically chosen to improve process monitoring. In the total cholesterol level example discussed in Section 1, for instance, patients often choose their clinic visit times based on their convenience or health conditions.

At the end of this subsection, we would like to make several remarks about the DySS problem and the CUSUM chart (8)-(9). First, the above description focuses only on detecting upward shifts in the process mean function µ(t). Control charts for detecting downward mean shifts or arbitrary mean shifts can be developed in the same way, except that the upward charting statistic in (8) and the decision rule in (9) should be changed to their downward or two-sided versions (cf. Qiu 2013, Chapter 4). Second, control charts for detecting shifts in the process variance function σ²_y(t), or shifts in both µ(t) and σ²_y(t), can be developed in a similar way, after (8) and (9) are replaced by appropriate charting statistics and related decision rules. See, for instance, Gan (1995) and Yeh et al. (2004) for discussions of the corresponding problems in the conventional SPC setup. Third, although a CUSUM chart is discussed above, EWMA charts and charts based on change-point detection (CPD) can also be used in the DySS problem. Generally speaking, the CUSUM and EWMA charts have similar performance, and the CPD charts can provide estimates of the shift position at the time when they give signals, although their computation is usually more extensive. See Qiu (2013) for a related discussion. Fourth, to use the proposed DySS method, the observation times {t*_j} of the new individual to monitor usually cannot be larger than the largest observation time {t_ij} in the IC data, to avoid extrapolation in the data standardization in (7). In other words, it is inappropriate to use the DySS method to monitor the new individual outside the time range [min_{i,j} t_ij, max_{i,j} t_ij]. From the description of the DySS method above, it can be seen that this methodology actually combines the cross-sectional comparison between the new individual to monitor and the m well-functioning individuals in the IC data with the sequential monitoring of the new individual in a dynamic manner. Therefore, to use the method to monitor the longitudinal behavior of the new individual at time t, there should exist observations around t in the IC data to make the cross-sectional comparison possible. Fifth, although the simplest CUSUM chart (8)-(9) is used in this paper to demonstrate the DySS method, some existing research in the SPC literature to accommodate correlated data (e.g., Apley and Tsung 2002, Jiang 2004) or to detect dynamic mean shifts (e.g., Han and Tsung 2006, Shu and Jiang 2006) might also be considered here. In the numerical study presented in Section 3, the control charts of Han and Tsung (2006) and Shu and Jiang (2006) will be discussed.

2.3 Cases when the longitudinal observations follow an AR(1) model

In the above discussion, we assumed that the longitudinal observations {y(t*_j), j = 1, 2, . . .} are independent. In practice, they are often correlated. In the SPC literature, time series models, especially the AR(1) model, are commonly used for describing the autocorrelation in such cases (cf. Jiang et al. 2000, Lu and Reynolds 2001). In this section, we discuss how to use the proposed DySS procedure in cases when the longitudinal observations follow an AR(1) model. In a conventional time series model, observation times are usually equally spaced. In the current DySS problem, however, they can be unequally spaced. In such cases, time series modeling is especially challenging (cf. Maller et al. 2008, Vityazev 1996), and a DySS approach with a more general time series model is not yet available; it is left for our future research.

Let us first discuss estimation of the regular longitudinal pattern from an IC dataset when the longitudinal data are correlated and follow an AR(1) model. Assume that there are m well-functioning individuals whose longitudinal observations follow the model (1), in which ε(t_ij) = σ_y(t_ij)ǫ(t_ij) and ǫ(t_ij) has mean 0 and variance 1 for each t_ij ∈ [0, 1]. Instead of assuming independence, in this subsection we assume that ǫ(t_ij) follows the AR(1) model

ǫ(t_ij) = φ ǫ(t_ij − ω) + e(t_ij), for i = 1, . . . , m, j = 1, . . . , J,    (11)

where φ is a coefficient, ω is the basic time unit defined in Section 2.2, and e(t) is a white noise process over [0, 1]. The model (11) is the conventional AR(1) model for cases with equally spaced observation times. It does not have a constant term on the right-hand side because the mean of ǫ(t_ij) is 0. In the current DySS problem, it is inconvenient to work with the model (11) directly, because there may not be observations at the times t_ij − ω used in (11). To overcome this difficulty, we can transform the AR(1) model (11) into the following time series model defined at the actual observation times {t_ij}:

ǫ(t_ij) = φ^{Δ_{i,j−1}} ǫ(t_{i,j−1}) + Θ̃_ij(z) e(t_ij),    (12)

where Δ_{i,j−1} = (t_ij − t_{i,j−1})/ω, z is the lag operator used in time series modeling (e.g., z ǫ(t) = ǫ(t − ω)), and

Θ̃_ij(z) = 1 + φ z + · · · + φ^{Δ_{i,j−1}−1} z^{Δ_{i,j−1}−1}

is a lag polynomial. As in Sections 2.1 and 2.2, let µ̂(t; Ṽ) and σ̂²_y(t) be the local pth-order polynomial kernel estimators of µ(t) and σ²_y(t), and

ǫ̂(t_ij) = ( y(t_ij) − µ̂(t_ij; Ṽ) ) / σ̂_y(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

Then, the least squares (LS) estimator of φ, denoted as φ̂, can be obtained by minimizing

Σ_{i=1}^m Σ_{j=2}^J { ǫ̂(t_ij) − φ^{Δ_{i,j−1}} ǫ̂(t_{i,j−1}) }².
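A minimal sketch of this LS step, assuming the standardized residuals and observation times are held in (m, J) arrays; the use of SciPy's bounded scalar minimizer is our own convenience, not the paper's prescription.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_phi(eps, times, omega):
    """LS estimator of the AR(1) coefficient phi based on model (12).

    eps   : (m, J) array of standardized residuals eps-hat(t_ij).
    times : (m, J) array of observation times t_ij (integer multiples of omega).
    """
    delta = np.diff(times, axis=1) / omega     # Delta_{i,j-1}; integer-valued

    def rss(phi):
        pred = phi ** delta * eps[:, :-1]      # phi^Delta_{i,j-1} * eps-hat(t_{i,j-1})
        return float(np.sum((eps[:, 1:] - pred) ** 2))

    return minimize_scalar(rss, bounds=(-0.999, 0.999), method="bounded").x
```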

Next, we use the estimated IC longitudinal pattern described by µ̂(t; Ṽ), σ̂²_y(t), and φ̂ to monitor a new individual's longitudinal pattern. Assume that the new individual's y observations follow the model (6), in which the error term ǫ(t*_j) follows the AR(1) model

ǫ(t*_j) = φ ǫ(t*_j − ω) + e(t*_j), for j = 1, 2, . . . ,

where φ is the same coefficient as in (11). Then, similar to the relationship between models (11) and (12), the above AR(1) model implies that

ǫ(t*_j) = φ^{Δ*_{j−1}} ǫ(t*_{j−1}) + e*_j,    (13)

where Δ*_{j−1} = (t*_j − t*_{j−1})/ω, e*_j = Θ̃*_j(z) e(t*_j), and

Θ̃*_j(z) = 1 + φ z + · · · + φ^{Δ*_{j−1}−1} z^{Δ*_{j−1}−1}.

It can be checked that, when the new individual is IC, {e*_j, j = 1, 2, . . .} are independent with mean 0 and variance σ²_{e*_j} = 1 − φ^{2Δ*_{j−1}}. In the SPC literature, it has been well demonstrated that, when ǫ(t*_j) follows the time series model (13), an upward mean shift in ǫ(t*_j) can be detected by monitoring {e*_j, j = 1, 2, . . .} (cf., e.g., Lu and Reynolds 2001). Based on all these results, to detect an upward mean shift in the original response y, the charting statistic of the upward CUSUM chart can be defined by

C⁺_j = max[ 0, C⁺_{j−1} + ( ǫ̂(t*_j) − φ̂^{Δ*_{j−1}} ǫ̂(t*_{j−1}) ) / √(1 − φ̂^{2Δ*_{j−1}}) − k ], for j ≥ 2,    (14)

where C⁺₁ = 0, k > 0 is an allowance constant, and ǫ̂(t*_j) = ( y(t*_j) − µ̂(t*_j; Ṽ) ) / σ̂_y(t*_j). The chart gives a signal when

C⁺_j > ρ,    (15)

where ρ > 0 is a control limit chosen to achieve a given ATS₀ value.
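A minimal sketch of the chart (14)-(15), following the conventions of the earlier sketches:

```python
def ar1_cusum(eps_hat, deltas, phi, k, rho):
    """Upward CUSUM chart (14)-(15) for AR(1)-correlated observations.

    eps_hat : standardized observations eps-hat(t*_1), eps-hat(t*_2), ...
    deltas  : deltas[j-1] = (t*_{j+1} - t*_j) / omega, integer-valued gaps.
    phi     : estimated AR(1) coefficient phi-hat.
    """
    c = 0.0                                          # C_1^+ = 0
    for j in range(1, len(eps_hat)):
        d = deltas[j - 1]
        innov = (eps_hat[j] - phi**d * eps_hat[j - 1]) / (1 - phi**(2 * d)) ** 0.5
        c = max(0.0, c + innov - k)                  # charting statistic (14)
        if c > rho:                                  # decision rule (15)
            return j
    return None
```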

At the end of this subsection, we would like to point out that there are actually two different ways to estimate µ(t), σ²_y(t), and φ from IC data when the AR(1) model (12) is involved. Because the estimation procedure described in Section 2.1 does not require the error covariance function V(s, t) to be known, the first way is to estimate µ(t) and σ²_y(t) by the four steps described in Section 2.1 without using any prior information about the error structure, and then estimate φ by the LS procedure described above. Alternatively, in the case when we are sure that the AR(1) model (12) is appropriate for describing the error structure, a parametric formula for V(s, t) can be derived in terms of φ, denoted as V_φ(s, t). Then µ(t) and σ²_y(t) can be estimated by µ̂(t; V_φ) and σ̂²_{y,φ}(t), respectively, where σ̂²_{y,φ}(t) is the same as σ̂²_y(t) except that µ̂(t; V_φ) instead of µ̂(t; Ṽ) is used in its construction. Then φ can be estimated by the LS estimator defined as the solution of

min_φ Σ_{i=1}^m Σ_{j=2}^J { ǫ̂_φ(t_ij) − φ^{Δ_{i,j−1}} ǫ̂_φ(t_{i,j−1}) }²,

where

ǫ̂_φ(t_ij) = ( y(t_ij) − µ̂(t_ij; V_φ) ) / σ̂_{y,φ}(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

We have checked that the results of these two approaches are close to each other as long as the IC sample size is not small. In all simulation examples discussed in Section 3, the second approach is used, while the first approach is used in the real data example discussed in Section 4, because we are not sure whether the AR(1) model is appropriate at the beginning of the data analysis.

    2.4 Cases with arbitrary autocorrelation

In the previous two parts, we assume that the original observations {y(t*_j), j = 1, 2, . . .} of a new individual are either independent or correlated with the autocorrelation described by the AR(1) model (11). Also, the observation distribution is assumed to be normal. In practice, all these assumptions could be violated. If one or more of these model assumptions are violated, then the related control charts (8)-(9) and (14)-(15) may not be reliable, because their actual ATS₀ values could be substantially different from the assumed ATS₀ values. See related discussions in Hawkins and Deng (2010), Qiu and Hawkins (2001), Qiu and Li (2011a,b), and Zou and Tsung (2010). In this subsection, we propose a numerical approach to compute the control limit ρ of the chart (8)-(9) from IC data such that the chart is still appropriate to use in cases when within-subject observations are correlated with an arbitrary autocorrelation and when a parametric form of their distribution is unavailable.

Assume that there is an IC dataset consisting of a group of m well-functioning individuals whose observations of y follow the model (1). The data of the first m₁ individuals are used for obtaining the estimators µ̂(t; Ṽ) and σ̂²_y(t), as discussed in Section 2.1. Then, the control limit ρ of the chart (8)-(9) can be computed from the remaining m₂ = m − m₁ individuals using a block bootstrap procedure (e.g., Lahiri 2003), described below. First, we compute the standardized observations of the m₂ well-functioning individuals by

ǫ̂(t_ij) = ( y(t_ij) − µ̂(t_ij; Ṽ) ) / σ̂_y(t_ij), for i = m₁ + 1, m₁ + 2, . . . , m, j = 1, 2, . . . , J.

Second, we randomly select B individuals with replacement from the m₂ well-functioning individuals, and use their standardized observations to compute the value of ρ by a numerical searching algorithm, such that a given ATS₀ level is reached. Such a searching algorithm has been well described in Qiu (2008) and Qiu (2013, Section 4.2). Basically, we first give an initial value to ρ and compute the actual ATS₀ value of the chart (8)-(9) from the standardized observations of the B resampled individuals. If the actual ATS₀ value is smaller than the given ATS₀ value, then the value of ρ is increased; otherwise, the value of ρ is decreased. Then the actual ATS₀ value of the chart is computed again, using the updated value of ρ. This process is repeated until the given ATS₀ level is reached to a certain accuracy. In all examples of this paper, we choose m₁ = m/2.
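The following sketch combines the block bootstrap resampling (whole individuals as blocks, as described above) with a bisection-style search for ρ, reusing upward_cusum from the sketch in Section 2.2; the censoring of series that never signal is a crude simplification of our own, not part of the paper's algorithm.

```python
import numpy as np

def bootstrap_rho(eps_ic, gaps, k, ats0_target, B=1000,
                  lo=0.0, hi=10.0, tol=0.005, rng=None):
    """Block bootstrap search for the control limit rho of chart (8)-(9).

    eps_ic : (m2, J) standardized observations of the held-out IC individuals.
    gaps   : (J,) gaps between successive observation times, in basic time units.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    m2 = eps_ic.shape[0]
    blocks = eps_ic[rng.integers(0, m2, size=B)]   # B whole individuals, resampled
    times = np.cumsum(gaps)                        # observation times in basic units

    def ats0(rho):
        total = 0.0
        for eps in blocks:
            j = upward_cusum(eps, k, rho)          # sketch from Section 2.2
            total += times[j] if j is not None else times[-1]   # crude censoring
        return total / B

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ats0(mid) < ats0_target else (lo, mid)
    return 0.5 * (lo + hi)
```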

It should be pointed out that, if it is reasonable to assume that the within-subject observations are independent, then we can also draw bootstrap samples in the conventional way from the set of residuals {ǫ̂(t_ij), i = m₁ + 1, m₁ + 2, . . . , m, j = 1, 2, . . . , J}. Namely, we can randomly select J residuals with replacement from this set as the standardized observations of a bootstrap-resampled "individual", and B bootstrap-resampled individuals can be generated in that way for computing the value of ρ. When the within-subject observations are correlated, it has been demonstrated in the literature that a block bootstrap procedure, such as the one described above, is more appropriate (cf. Lahiri 2003).

    3 Numerical Study

In this section, we present some simulation results investigating the numerical performance of the proposed DySS procedure described in the previous section. In estimating the mean function µ(t) and the variance function σ²_y(t) of the longitudinal response y(t) from IC data by the local smoothing method described in Section 2.1, p is fixed at 1 (i.e., local linear kernel smoothing is used), the kernel function is chosen to be the Epanechnikov kernel function K(t) = 0.75(1 − t²)I(|t| ≤ 1), which is widely used in the curve estimation literature, and all bandwidths are chosen by the CV procedure.

In the first example, it is assumed that the IC mean function µ(t) and the IC variance function σ²_y(t) are known to be

µ(t) = 1 + 0.3 t^{1/2}, σ²_y(t) = µ²(t), for t ∈ [0, 1],

that observations of a given individual at different time points are independent of each other, and that the sampling rate is the same for all individuals, namely d observations every 10 basic time units. In such cases, the standardized observations are defined by (10), and the control limit ρ of the CUSUM chart (8)-(9) can easily be computed by a numerical algorithm, after its allowance constant k and the ATS₀ value are pre-specified (cf. Table 1 and the related description in Section 2.2). Now, let us consider mean shifts from µ(t) to

µ₁(t) = µ(t) + δ, for t ∈ [0, 1],

that occur at the initial time point, where δ is the shift size taking values from 0 to 2.0 with a step of 0.1. The basic time unit in this example is ω = 0.001, and the sampling rate d is chosen to be 2, 5, or 10. In the CUSUM chart (8)-(9), the ATS₀ value is fixed at 100, and the allowance constant k is chosen to be 0.1, 0.2, or 0.5. The corresponding ρ values are computed to be 2.820, 2.351, and 1.493 when d = 2; 4.612, 3.713, and 2.234 when d = 5; and 6.407, 4.937, and 2.852 when d = 10. The ATS₁ values of the chart in cases when d = 2, 5, 10, computed from 10,000 replicated simulations, are shown in Figure 1(a)-(c). From the plots, it can be seen that (i) the chart with a smaller (larger) k performs better for detecting small (large) shifts, and (ii) the ATS₁ values tend to be smaller when d is larger. The first result is generally true for CUSUM charts (cf. Hawkins and Olwell 1998), and the second result is intuitively reasonable because more observations are available for process monitoring when d is larger.

[Figure 1 about here. (a)-(c): ATS₁ values of the chart (8)-(9) in cases when the IC process mean function µ(t) and the process variance function σ²_y(t) are assumed known, k = 0.1, 0.2, 0.5, ATS₀ = 100, the step shift size δ changes from 0 to 2 with a step of 0.1, and the sampling rate is d = 2 (plot (a)), d = 5 (plot (b)), and d = 10 (plot (c)). (d)-(f): Corresponding results when the mean shift is a nonlinear drift from µ(t) to µ₁(t) = µ(t) + δ(1 − exp(−10t)), for t ∈ [0, 1]. The y-axes of all plots are in natural log scale.]

In practice, a more realistic scenario is that, once a shift occurs, the shift size changes over time. In the total cholesterol level example discussed in Section 1, if a patient's total cholesterol level begins to deviate from the regular pattern and no medical treatments are given in a timely fashion, then his/her total cholesterol level could deviate from the regular pattern more and more over time. To simulate such scenarios, we consider a nonlinear drift from µ(t) to

µ₁(t) = µ(t) + δ(1 − exp(−10t)), for t ∈ [0, 1],

that occurs at the initial time point, where δ is a constant. When δ changes from 0 to 2.0 with a step of 0.1 and the other settings remain the same as those in Figure 1(a)-(c), the ATS₁ values of the chart (8)-(9) are shown in Figure 1(d)-(f). From the plots, conclusions similar to those from Figure 1(a)-(c) can be made. We also performed simulations in cases when ω = 0.005 and ATS₀ = 20. The results are included in the supplementary file, and they demonstrate a pattern similar to that in Figure 1.

Next, we consider a more realistic situation in which the IC mean function µ(t) and the IC variance function σ²_y(t) are both unknown and need to be estimated from an IC dataset. Assume that the IC dataset includes observations of m individuals. For each individual, the sampling rate is d, which is assumed to be the same as that of the new individual under online monitoring, and the basic time unit is ω = 0.001. Then, when d = 2, each individual in the IC dataset has 200 observations in the design interval [0, 1]. After µ(t) and σ²_y(t) are estimated from the IC dataset, observations of the new individual are standardized by (7) for online monitoring. The standardized observations {ǫ̂(t*_j)} are regarded as i.i.d. random variables with common distribution N(0, 1) when the new individual is IC. When ATS₀, k, and d are given, the control limit ρ can be computed as if µ(t) and σ²_y(t) were known (cf. Table 1). Then, the actual ATS₀ value of the chart (8)-(9) is computed using 10,000 simulations of the online monitoring. This entire process, from the generation of the IC data to the computation of the actual ATS₀ value, is then repeated 100 times. The averaged actual ATS₀ values in cases when the nominal ATS₀ = 100, k = 0.1, 0.2, or 0.5, d = 2, 5, or 10, and m = 5, 10, or 20 are presented in Figure 2, along with the ATS₁ values of the chart for detecting step shifts of size δ = 0.25, 0.5, 0.75, and 1.0 occurring at the initial time point. These ATS₀ and ATS₁ values are also presented in Table S.1 of the supplementary file. From the figure and the table, it can be seen that: (i) the actual ATS₀ values are within 5% of the nominal ATS₀ value of 100 in all cases when m ≥ 10, (ii) the actual ATS₀ values are closer to the nominal ATS₀ value when d is larger, (iii) the ATS₁ values are smaller when δ is larger or d is larger, (iv) the ATS₁ values do not depend much on the value of m, especially when δ and d are large, and (v) the ATS₁ values are smallest when k = 0.1 and δ is small (i.e., δ = 0.25), when k = 0.2 and δ is medium (e.g., δ = 0.5 or 0.75), or when k = 0.5 and δ is large (i.e., δ = 1.0). The corresponding results for detecting the drifts considered in Figure 1(d)-(f) are presented in Figure S.2 and Table S.2 of the supplementary file. Results in cases when ATS₀ = 20 are presented in Tables S.3 and S.4 of the supplementary file. Similar conclusions can be drawn from all these results.

[Figure 2 about here. Actual ATS₀ and ATS₁ values of the chart (8)-(9), for detecting a step mean shift of size δ occurring at the initial time point, in cases when the IC mean function µ(t) and the IC variance function σ²_y(t) are estimated from IC data with m individuals, m = 5, 10, or 20, k = 0.1 (1st column), 0.2 (2nd column), or 0.5 (3rd column), d = 2 (1st row), 5 (2nd row), or 10 (3rd row), ω = 0.001, and the nominal ATS₀ is 100.]

In the above examples, the sampling rate d is chosen as 2, 5, and 10. In cases when ω = 0.001, each individual would then have 200, 500, and 1000 observations, respectively, in the design interval [0, 1]. In some applications, such as the total cholesterol level example discussed in Section 4, individuals have fewer observations. Next, in the example of Figure 2, we choose d to be 0.2, 0.5, and 1, m to be 10, 20, and 40, and keep the other parameters unchanged. The corresponding actual ATS₀ and ATS₁ values of the chart (8)-(9) for detecting the step shifts are shown in Table S.5 and Figure S.3 of the supplementary file, and the results for detecting the drifts are shown in Table S.6 and Figure S.4 of the supplementary file. From these figures and tables, it can be seen that most conclusions made above from Figures 2 and S.2 remain valid here, except that (i) for a given m value, the actual ATS₀ values in this case are a little farther from the nominal ATS₀ value of 100 than the actual ATS₀ values shown in Figures 2 and S.2, (ii) m ≥ 20 is required in this case for the actual ATS₀ values to be within about 5% of the nominal ATS₀ value, and (iii) the actual ATS₁ values in this case are generally larger than the corresponding values shown in Figures 2 and S.2. All these results are intuitively reasonable.

In the literature, there are some existing control charts designed specifically for detecting time-varying mean shifts, or mean drifts. One type of such control chart is based on the idea of designing a chart adaptively using a dynamic prediction of the shift size. Such adaptive CUSUM or EWMA charts (cf. Capizzi and Masarotto 2003, Shu and Jiang 2006, Sparks 2000) can be used in our proposed DySS method in place of the conventional CUSUM chart (8)-(9). Another control chart designed for detecting time-varying mean shifts is the reference-free cuscore (RFCuscore) chart of Han and Tsung (2006). Next, we consider using the adaptive CUSUM chart of Shu and Jiang (2006) and the RFCuscore chart of Han and Tsung (2006) in our DySS method. In the adaptive CUSUM chart, the charting statistic C⁺_j in (8) is changed to

C⁺_{j,A} = max[ 0, C⁺_{j−1,A} + ( ǫ̂(t*_j) − δ̂⁺_j/2 ) / ρ(δ̂⁺_j/2) ], for j ≥ 1,

where C⁺_{0,A} = 0, δ̂⁺_j is the predicted shift size at time j, defined by

δ̂⁺_j = max( δ⁺_min, (1 − λ) δ̂⁺_{j−1} + λ ǫ̂(t*_j) ), for j ≥ 1,

δ̂⁺₀ = δ⁺_min, δ⁺_min is a parameter denoting the minimum shift size, λ ∈ (0, 1] is a weighting parameter, ρ(k) is a control limit defined by

ρ(k) = [ log(1 + 2k²·ARL₀ + 2.332k) ] / (2k) − 1.166,

and ARL₀ = ATS₀ · d/10 is the pre-specified IC ARL value. The chart gives a signal at time j when C⁺_{j,A} > 1. As in Shu and Jiang (2006), we choose λ = 0.1 in this procedure. The RFCuscore chart gives a signal at time j if

max_{1≤w≤j} Σ_{i=j−w+1}^{j} |ǫ̂(t*_i)| ( ǫ̂(t*_i) − |ǫ̂(t*_i)|/2 ) ≥ c,

where c is a control limit chosen to reach a given ATS₀ level.
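A minimal sketch of the adaptive CUSUM recursion just described; the function name and interface are ours, and the ρ(k) formula is copied from the display above under our reading of its bracketing.

```python
import math

def adaptive_cusum(eps_hat, delta_min, arl0, lam=0.1):
    """Adaptive CUSUM of Shu and Jiang (2006) as used here: returns the
    index of the first signal (0-based), or None."""
    def rho(k):
        return math.log(1 + 2 * k**2 * arl0 + 2.332 * k) / (2 * k) - 1.166

    c, d_hat = 0.0, delta_min          # C_{0,A}^+ = 0, delta-hat_0^+ = delta_min
    for j, e in enumerate(eps_hat):
        d_hat = max(delta_min, (1 - lam) * d_hat + lam * e)   # predicted shift size
        c = max(0.0, c + (e - d_hat / 2) / rho(d_hat / 2))
        if c > 1.0:                    # signal when C_{j,A}^+ > 1
            return j
    return None
```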

In the example of Figures 2 and S.2, the actual ATS₀ and ATS₁ values of the adaptive CUSUM chart and the RFCuscore chart are computed in the same way as those of the chart (8)-(9), in cases when δ⁺_min is chosen to be 0.2, 0.4, and 1.0 and the other parameters take the same values as those in Figures 2 and S.2. These values are presented in Tables S.7-S.9 and Figures S.5-S.7 of the supplementary file. In cases when m = 20, k = 0.2 in the chart (8)-(9), and δ⁺_min = 0.4 in the adaptive CUSUM chart, the computed actual ATS₀ and ATS₁ values of the three charts are shown in Figure S.8 of the supplementary file, for detecting both step shifts and drifts. From these results, it can be seen that (i) the conventional CUSUM chart (8)-(9) and the adaptive CUSUM chart perform almost identically in all cases considered, and (ii) the RFCuscore chart performs slightly worse in certain cases, especially when detecting drifts.

The supplementary file contains some results for cases when observations at different time points are correlated and their error terms follow the AR(1) model (11). These results show that the CUSUM chart (14)-(15), which is based on the standardized error terms, is indeed effective in detecting mean shifts in the original process observations. In cases when µ(t), σ²_y(t), and φ are unknown and need to be estimated from an IC dataset, it seems that more IC data are needed to keep the actual ATS₀ values of the chart (14)-(15) within 10% of the nominal ATS₀ value, compared to the results in Figure 2 where within-subject observations are independent. Finally, in the supplementary file, we also consider an example in which the within-subject observations are correlated but the correlation does not follow the AR(1) model (11). The block bootstrap procedure discussed in Section 2.4 is used in designing our proposed DySS method. From the results of this example, it can be seen that the DySS method still performs well in such cases.

    4 Application to the Total Cholesterol Level Example

    In this section, we apply our proposed DySS method to the total cholesterol level example that

    is briefly described in Section 1. This example is a part of the SHARe Framingham Heart Study

    of the National Heart Lung and Blood Institute (cf., Cupples et al. 2007). In the data, there are

    21

  • 1028 non-stroke patients (i.e. m = 1028) and 27 stroke patients. In the study, each patient was

    followed 7 times (i.e., J = 7), and the total cholesterol level (in mg/100ml) was recorded at each

time. Because the total cholesterol level, denoted as y, is an important risk factor for stroke, it is

    important to detect its irregular temporal pattern so that medical interventions can be applied in a

    timely manner and stroke can be avoided. Here, we demonstrate that our proposed DySS method

    can be used for identifying irregular total cholesterol level patterns in this example.

    First, the observed data of the 1028 non-stroke patients are used as the IC data, from which

    we can estimate the IC mean function µ(t) and the IC variance function σ2y(t). To this end, the

    covariance function V (s, t) is first estimated by (5), and then µ(t) and σ2y(t) are estimated by

    µ̂(t; Ṽ ) and σ̂2y(t), respectively, as described in Section 2.1. The 95% pointwise confidence band

    of µ(t), defined to be µ̂(t; Ṽ ) ± 1.96σ̂y(t), and µ̂(t; Ṽ ) are presented in Figure 3 with the observed

    total cholesterol levels of the 27 stroke patients. From the figure, it can be seen that only 6 out of

    27 stroke patients are detected to have irregular total cholesterol level patterns by this confidence

interval approach, because their observed total cholesterol levels exceed the upper confidence limit at least once. From the figure, it can also be seen that there are no stroke patients whose observed total cholesterol levels fall below the lower confidence limit.
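As a baseline for comparison, this confidence-interval screening reduces to a pointwise check of each patient's observations against µ̂(t) ± 1.96σ̂y(t); below is a minimal sketch, assuming the estimated functions have already been evaluated at the patient's observation times (function and argument names are our own).

```python
import numpy as np

def band_flags(y, mu_hat, sigma_hat, z=1.96):
    """Flag a patient whose observed values ever leave the pointwise band
    mu_hat(t) +/- z * sigma_hat(t) evaluated at the observation times."""
    y, mu, sd = np.asarray(y), np.asarray(mu_hat), np.asarray(sigma_hat)
    above = bool(np.any(y > mu + z * sd))  # e.g., irregularly high cholesterol
    below = bool(np.any(y < mu - z * sd))
    return above, below
```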

    After µ(t) and σ2y(t) are estimated, we can compute the standardized residuals by

$$
\hat{\epsilon}(t_{ij}) = \frac{y(t_{ij}) - \hat{\mu}(t_{ij}; \widetilde{V})}{\hat{\sigma}_y(t_{ij})}, \quad \text{for } i = 1, \ldots, m,\; j = 1, \ldots, J.
$$

    To address the possible autocorrelation among {ǫ̂(tij)}, we consider the AR(1) model (11) (see also

    (12)), and the coefficient φ in the model is estimated by the LS procedure described in Section 2.3

    to be φ̂ = 0.9217. The goodness-of-fit of the AR(1) model is studied as follows. First, the predicted

    values of the model (cf., (12))

$$
\widetilde{\epsilon}(t_{ij}) = \hat{\phi}^{\,\Delta_{i,j-1}}\, \hat{\epsilon}(t_{i,j-1})
$$

    are ordered from the smallest to the largest and then divided into g groups of the same size. Namely,

    the kth group includes all predicted values in the interval Ik = [qk−1, qk), for k = 1, 2, . . . , g, where

    qk is the (k/g)th quantile of the predicted values {ǫ̃(tij)}, for k = 1, 2, . . . , g − 1, q0 = −∞, and

    qg = ∞. Let Ok be the proportion of all standardized residuals {ǫ̂(tij)} in the interval Ik, for

k = 1, 2, . . . , g.

[Figure 3 appears near here.]

Figure 3: The 95% pointwise confidence band of the mean total cholesterol level µ(t) (dashed curves), the point estimator µ̂(t; Ṽ) of µ(t) (dark solid curve), and the observed total cholesterol levels of the 27 stroke patients (thin curves). The horizontal axis shows age (years) and the vertical axis shows the total cholesterol level (mg/100ml).

Then, the following Pearson's $\chi^2$ statistic
$$
X^{2} = \sum_{k=1}^{g} \frac{(O_k - 1/g)^{2}}{1/g}
$$
measures the discrepancy between the distribution of the standardized residuals and the distribution

    of their predicted values by the AR(1) model. If the AR(1) model fits the data well, then the

distribution of $X^2$ should be approximately $\chi^2_{g-2}$, where the number of degrees of freedom is $g-2$ because

    there is one parameter (i.e., φ) in the AR(1) model that needs to be estimated beforehand. If we

    choose g = 12, then the observed value of X2 is 0.2265, giving a p-value of about 1.0. We also

    tried other g values in the interval [5, 50], and similarly large p-values were obtained. Therefore,

    the AR(1) model seems to fit the standardized residuals well.
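The goodness-of-fit computation just described is easy to script; a sketch in Python follows, where `eps` holds all standardized residuals and `eps_pred` their AR(1) predicted values (both hypothetical names), and X² is formed with proportions exactly as defined above.

```python
import numpy as np
from scipy import stats

def ar1_gof(eps, eps_pred, g=12):
    """Pearson chi-square goodness-of-fit check of the AR(1) fit.

    Groups I_k are delimited by the (k/g)th quantiles of the predicted
    values; O_k is the proportion of standardized residuals in I_k."""
    eps, eps_pred = np.asarray(eps), np.asarray(eps_pred)
    qs = np.quantile(eps_pred, np.arange(1, g) / g)  # interior cut points
    group = np.searchsorted(qs, eps, side="right")   # group index 0..g-1
    o_k = np.bincount(group, minlength=g) / len(eps)
    x2 = np.sum((o_k - 1.0 / g) ** 2 / (1.0 / g))
    # g - 2 degrees of freedom: one is lost for the estimated phi
    return x2, stats.chi2.sf(x2, df=g - 2)
```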

    After µ(t), σ2y(t) and φ are all estimated, next we design a CUSUM chart to detect irregular

    longitudinal patterns of the total cholesterol levels of the 27 stroke patients. In this example,

    because we are mainly concerned about the upward shifts in patients’ total cholesterol levels, only

    the upward CUSUM chart (14)-(15) is considered here. Since each stroke patient has an average

    of 2.3 observations every 10 years (cf., Figure 3), we choose d = 2. In the CUSUM chart, we

    choose k = 0.1 and ATS0 = 25. In such cases, the control limit ρ is computed to be 0.927. The

    CUSUM charts for monitoring the 27 stroke patients are presented in Figure 4, from which we can

    see that 22 out of the 27 stroke patients are detected to have upward mean shifts. The 22 signal


times, computed from the starting point of each patient's process monitoring, are listed in Table 2. The

    average signal time is 13.818 years. Compared to the confidence interval approach described at

    the beginning of this section, it can be seen that our DySS approach is more effective in detecting

    irregular longitudinal patterns of the total cholesterol levels.
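The monitoring step that produced these signals is a standard upward CUSUM; the sketch below assumes the AR(1)-decorrelated standardized residuals of one patient (the input to chart (14)-(15)) are already available, and uses the parameters of this example, (k, h) = (0.1, 0.927).

```python
def upward_cusum_signal(e, k=0.1, h=0.927):
    """Upward CUSUM C_j^+ = max(0, C_{j-1}^+ + e_j - k), with C_0^+ = 0.
    Returns the first index j (1-based) with C_j^+ > h, or None.
    The index can be mapped back to the patient's observation times to
    report the signal time in years, as in Table 2 below."""
    c = 0.0
    for j, ej in enumerate(e, start=1):
        c = max(0.0, c + ej - k)
        if c > h:
            return j
    return None
```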

Table 2: Signal times (years) of the 22 stroke patients detected by our proposed CUSUM chart (14)-(15).

Patient ID   Signal time     Patient ID   Signal time
    2            12              16           19
    3            13              17           12
    4            12              18            8
    5            22              19           12
    6            11              20           16
    7             8              21            8
    9            23              22            9
   11            24              23           11
   12             8              24           16
   13             7              25           26
   15             8              27           19
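As a quick arithmetic check, the 22 tabulated signal times sum to 304 years, so the average signal time is $304/22 \approx 13.818$ years, matching the value reported above.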

    5 Concluding Remarks

We have described the DySS method for dynamically identifying individuals with irregular longitudinal patterns. This approach makes use of all available information when monitoring the longitudinal pattern of an individual, including the information from an IC population of well-functioning individuals and all historical data of the individual being monitored, using both longitudinal data analysis and SPC techniques. Our numerical examples have shown

    that it is effective in various cases. Although this paper focuses on the monitoring of the mean

    function of the longitudinal response, similar DySS screening systems can be developed for mon-

    itoring the variance function and other features of the response that may affect the longitudinal

    performance of the individuals in a specific application. This approach can also be generalized to

    cases when multiple responses are present.

    In this paper, we have discussed cases when the observation times are regularly spaced, ob-

    servation times are irregularly spaced, and the observations at different time points are correlated.

The cases when observation times are irregularly spaced, the observations at different time points are correlated, and the correlation needs to be described by a parametric time series model are especially challenging, because there is little existing discussion in the literature about time series modeling in such cases. In this paper, only the case when the AR(1) model (11) is appropriate is discussed. Future research is needed to handle cases with more flexible parametric time series models.

[Figure 4 appears near here.]

Figure 4: CUSUM charts (14)-(15) with (k, h, ATS0) = (0.1, 0.927, 25) for monitoring the 27 stroke patients (one panel per patient). The dashed horizontal lines denote the control limit ρ.

    Supplementary Materials

    supplemental.pdf: This pdf file discusses some statistical properties of the estimator µ̂(t; Ṽ )

    defined in Section 2.1, some technical details about the adjustment procedure (15) in Li

    (2011) and the CV procedure discussed in Section 2.1, and some extra numerical results.

    ComputerCodesAndData.zip: This zip file contains MATLAB source codes of our proposed

    DySS method and the stroke data used in the paper.


Acknowledgments: We thank the editor, the associate editor, and two referees for many

    constructive comments and suggestions, which greatly improved the quality of the paper. This

research is supported in part by an NSF grant and two grants from the Natural Science Foundation

    of China.

    References

Apley, D.W., and Tsung, F. (2002), "The autoregressive T² chart for monitoring univariate auto-

    correlated processes,” Journal of Quality Technology, 34, 80–96.

    Brockwell, P., Davis, R., and Yang, Y. (2007), “Continuous-time Gaussian autoregression,” Sta-

    tistica Sinica, 17, 63–80.

    Capizzi, G., and Masarotto, G. (2003), “An adaptive exponentially weighted moving average

    control chart,” Technometrics, 45, 199–207.

    Chen, K., and Jin, Z. (2005), “Local polynomial regression analysis of clustered data,” Biometrika,

    92, 59–74.

Cupples, L.A. et al. (2007), "The Framingham Heart Study 100K SNP genome-wide association

    study resource: overview of 17 phenotype working group reports,” BMC Medical Genetics,

    8, (Suppl 1): S1.

    Ding, Y., Zeng, L., and Zhou, S. (2006), “Phase I analysis for monitoring nonlinear profiles in

    manufacturing processes,” Journal of Quality Technology, 38, 199–216.

    Gan, F.F. (1995), “Joint monitoring of process mean and variance using exponentially weighted

    moving average control charts,” Technometrics, 37, 446–453.

    Han, D., and Tsung, F. (2006), “A reference-free cuscore chart for dynamic mean change detection

    and a unified framework for charting performance comparison,” Journal of the American

    Statistical Association, 101, 368–386.

    Hawkins, D.M., and Deng, Q. (2010), “A nonparametric change-point control chart,” Journal of

    Quality Technology, 42, 165–173.


Hawkins, D.M., and Olwell, D.H. (1998), Cumulative Sum Charts and Charting for Quality Im-

    provement, New York: Springer-Verlag.

    Jensen, W.A., and Birch, J.B. (2009), “Profile monitoring via nonlinear mixed models,” Journal

    of Quality Technology, 41, 18–34.

    Jiang, W. (2004), “Multivariate control charts for monitoring autocorrelated processes,” Journal

    of Quality Technology, 36, 367–379.

    Jiang, W., Tsui, K.L., and Woodall, W. (2000), “A new SPC monitoring method: the ARMA

    chart,” Technometrics, 42, 399–410.

    Knoth, S. (2011), “spc: Statistical Process Control,” R package version 0.4.0, http://CRAN.R-

    project.org/package=spc.

    Lahiri, S.N. (2003), Resampling Methods for Dependent Data, New York: Springer.

    Li, Y. (2011), “Efficient semiparametric regression for longitudinal data with nonparametric co-

    variance estimation,” Biometrika, 98, 355–370.

    Lu, C.W., and Reynolds, M.R., Jr. (2001), “Control charts for monitoring an autocorrelated

    process,” Journal of Quality Technology, 33, 316–334.

    Ma, S., Yang, L., and Carroll, R. (2012), “A simultaneous confidence band for sparse longitudinal

    regression,” Statistica Sinica, 22, 95–122.

    Maller, R.A., Müller, G., and Szimayer, A. (2008), “GARCH modelling in continuous time for

    irregularly spaced time series data,” Bernoulli, 14, 519–542.

    Montgomery, D. C. (2009), Introduction To Statistical Quality Control (6th edition), New York:

    John Wiley & Sons.

    Moustakides, G.V. (1986), “Optimal stopping times for detecting changes in distributions,” The

    Annals of Statistics, 14, 1379–1387.

    Pan, J., Ye, H., and Li, R. (2009), “Nonparametric regression of covariance structures in lon-

    gitudinal studies,” Technical Report, 2009.46, School of Mathematics, The University of

    Manchester, UK.

Qiu, P. (2008), "Distribution-free multivariate process control based on log-linear modeling," IIE Transactions, 40, 664–677.

Qiu, P. (2013), Introduction to Statistical Process Control, London: Chapman & Hall/CRC.

    Qiu, P., and Hawkins, D. (2001), “A rank based multivariate CUSUM procedure,” Technometrics,

    43, 120–132.

    Qiu, P., and Li, Z. (2011a), “On nonparametric statistical process control of univariate processes,”

    Technometrics, 53, 390–405.

    Qiu, P., and Li, Z. (2011b), “Distribution-free monitoring of univariate processes,” Statistics and

    Probability Letters, 81, 1833–1840.

    Qiu, P., Zou, C., and Wang, Z. (2010), “Nonparametric profile monitoring by mixed effects mod-

    eling (with discussions),” Technometrics, 52, 265–277.

    Reynolds, M.R., Jr., Amin, R.W., and Arnold, J.C. (1990), “CUSUM charts with variable sam-

    pling intervals,” Technometrics, 32, 371–384.

    Shu, L., and Jiang, W. (2006), “A Markov chain model for the adaptive cusum control chart,”

    Journal of Quality Technology, 38, 135–147.

    Sparks, R.S. (2000), “CUSUM charts for signalling varying location shifts,” Journal of Quality

    Technology, 32, 157–171.

    Vityazev, V.V. (1996), “Time series analysis of unequally spaced data: the statistical properties

    of the Schuster periodogram,” Astronomical and Astrophysical Transactions, 11, 159–173.

    Yao, F., Müller, H.G., and Wang, J.L. (2005), “Functional data analysis for sparse longitudinal

    data,” Journal of the American Statistical Association, 100, 577–590.

    Yeh, A.B., Lin, D.K.J., and Venkataramani, C. (2004), “Unified CUSUM charts for monitoring

    process mean and variability,” Quality Technology and Quantitative Management, 1, 65–86.

    Zhao, Z., and Wu, W. (2008), “Confidence bands in nonparametric time series regression,” Annals

    of Statistics, 36, 1854–1878.

    Zou, C., and Tsung, F. (2010), “Likelihood ratio-based distribution-free EWMA control charts,”

    Journal of Quality Technology, 42, 1–23.
