Univariate Dynamic Screening System: An Approach For Identifying
Individuals With Irregular Longitudinal Behavior
Peihua Qiu1 and Dongdong Xiang2
1School of Statistics, University of Minnesota
2School of Finance and Statistics, East China Normal University
Abstract
In our daily life, we often need to identify individuals whose longitudinal behavior differs
from that of well-functioning individuals, so that some unpleasant consequences
can be avoided. In many such applications, observations of a given individual are obtained se-
quentially, and it is desirable to have a screening system to give a signal of irregular behavior as
soon as possible after that individual’s longitudinal behavior starts to deviate from the regular
behavior, so that some adjustments or interventions can be made in a timely manner. This paper
proposes a dynamic screening system for that purpose in cases when the longitudinal behavior
is univariate, using statistical process control and longitudinal data analysis techniques. Sev-
eral different cases, including those with regularly spaced observation times, irregularly spaced
observation times, and correlated observations, are discussed. Our proposed method is demon-
strated using a real-data example about the SHARe Framingham Heart Study of the National
Heart, Lung and Blood Institute. This paper has supplementary materials online.
Key Words: Correlation; Dynamic screening; Longitudinal data; Process monitoring; Process
screening; Standardization; Statistical process control; Unequal sampling intervals.
1 Introduction
Most products (e.g., airplanes, cars, batteries of notebook computers) need to be checked
regularly or occasionally about certain variables related to their quality and/or performance. If
the observed values of the variables of a given product are significantly worse than the values of a
typical well-functioning product of the same age, then some proper adjustments or interventions
should be made to avoid any unpleasant consequences. This paper suggests a dynamic screening
system for identifying such improperly functioning products effectively.
The motivating example of this research is the SHARe Framingham Heart Study of the National
Heart, Lung and Blood Institute. In the study, 1055 patients were recruited, each patient was
followed 7 times, and the total cholesterol level of each patient was recorded during each of the 7
clinic visits. Among the 1055 patients, 1028 of them did not have any strokes during the study, and
the remaining 27 patients had at least one stroke. One major research goal of the study is to identify
patients with irregular total cholesterol level patterns as early as possible, so that some medical
treatments can be taken in a timely manner to avoid the occurrence of stroke. To this end, one
traditional method is to construct pointwise confidence intervals for the mean total cholesterol level
of the non-stroke patient population across different ages, using the longitudinal data of all non-
stroke patients. Then, a patient can be identified as one who has an irregular total cholesterol
level pattern if his/her total cholesterol levels go beyond the confidence intervals. In the literature,
there have been some discussions about the construction of such confidence intervals. See, for
instance, Ma et al. (2012), Yao et al. (2005), and Zhao and Wu (2008).
The confidence interval methods mentioned above use a cross-sectional comparison approach.
In the context of the total cholesterol level example, they compare patients’ total cholesterol levels
at the cross-section of a given age. One limitation of this approach is that it does not make use of
the medical history of a given patient in the medical diagnostics. If the total cholesterol levels at
different ages of a given patient are regarded as observations of a random process, then intuitively
we should make use of all such observations of the process for medical diagnostics. If a patient’s
total cholesterol level is consistently above the mean total cholesterol level of the non-stroke patient
population for a sufficiently long time period, then that patient should still be identified as a person
who has an irregular total cholesterol level pattern, even if his/her total cholesterol level at any
given time point is within the confidence interval of the mean total cholesterol level of the non-stroke
patient population. Further, it is critical that the diagnostic method should be dynamic in the sense
that a decision about a patient can be made at any time during the observation process as long as
all observations up to the current time point have provided enough evidence for the decision. That
is because the major purpose to monitor a patient’s total cholesterol level is to find any irregular
pattern as soon as possible so that some necessary medical treatments can be given in a timely
manner. The confidence interval methods mentioned in the previous paragraph usually do not
have such a dynamic decision-making property, because they do not follow a patient sequentially
over time. In the literature, some statistical process control (SPC) methodologies, including the
cumulative sum (CUSUM) and the exponentially weighted moving average (EWMA) charts, are
designed for such purposes (e.g., Hawkins and Olwell 1998, Montgomery 2009, Qiu 2013). These
control charts evaluate the performance of a process sequentially based on all observed data up
to the current time point. However, they cannot be applied to the current problem directly for
the following several reasons. First, most SPC methods are for monitoring a single process (e.g.,
a production line, a medical system). When the process performs satisfactorily, its observations
follow a so-called in-control (IC) distribution and the process is said to be IC. The major purpose
of the SPC control charts is to detect any distributional shift of the process observations as soon as
possible so that the related process can be stopped promptly for adjustments and repairs. In the
current problem, if each patient is regarded as a process, then there are many processes involved. To
judge whether a patient’s total cholesterol level is IC, we need to take the total cholesterol levels of
all non-stroke patients into consideration. A conventional SPC control chart is not designed for that
purpose. Second, in a typical SPC problem, if we are interested in monitoring the process mean,
then the process mean is assumed to be a constant when the process is IC. In the current problem,
however, the mean total cholesterol level of the non-stroke patients would change over time. Because
of these fundamental differences between the conventional SPC problem and the current problem,
the existing SPC methodologies cannot be applied directly to the current problem.
In this paper, we focus on cases when the variable to monitor (e.g., the total cholesterol level)
is univariate and we are interested in monitoring its mean. In such cases, we propose a dynamic
screening system (DySS) to dynamically identify individuals with irregular longitudinal patterns. In
the DySS method, the regular longitudinal pattern is first estimated from the observed longitudinal
data of a group of well-functioning individuals (e.g., the non-stroke patients in the stroke example).
Then, the estimated regular longitudinal pattern is used for standardizing the observations of an
individual to monitor. Finally, an SPC control chart is applied to the standardized observations of
that individual for online monitoring of its longitudinal behavior.
It should be pointed out that the DySS problem described above looks like a profile monitoring
problem in SPC. But, the two problems are actually completely different. In the profile monitoring
problem, observations obtained from a sampled product are a profile that describes the relationship
between a response variable and one or more explanatory variables, the profiles from different
sampled products are usually ordered by observation times or spatial locations of the sampled
products, and the major goal of the profile monitoring problem is for detecting any shift in the
mean and/or variance of the profiles over time/location (cf., e.g., Ding et al. 2006, Jensen and
Birch 2009, Qiu et al. 2010). As a comparison, in the DySS problem, although observations of each
individual look like a profile, each individual is actually treated as a separate process and we are
mainly interested in monitoring the longitudinal behavior of each individual. Different individuals
are often not ordered by time or location.
The remainder of the article is organized as follows. Our proposed DySS methodology is
described in detail in Section 2. A simulation study for evaluating its numerical performance is
presented in Section 3. Our proposed method is demonstrated using the motivating example about
patients’ total cholesterol levels in Section 4. Some concluding remarks are given in Section 5.
Theoretical properties of the estimated IC mean function of the longitudinal response are discussed
in a supplementary file, along with certain technical details and numerical results.
2 The DySS Method
Our proposed DySS method is described in four parts. In the first part, estimation of the regular
longitudinal pattern from the observed longitudinal data of a group of well-functioning individuals
is discussed. Then, an SPC chart is proposed in the second part for monitoring individuals dy-
namically, after their observations are standardized by the estimated regular longitudinal pattern.
In this part, longitudinal observations are assumed to be independent and normally distributed.
Situations when they are normally distributed and autocorrelated and the autocorrelation can be
described by an AR(1) time series model are discussed in the third part. Finally, general cases when
observations are autocorrelated with an arbitrary autocorrelation and when their distribution is
non-normal are discussed in the fourth part.
2.1 Estimation of the regular longitudinal pattern
Assume that y is the variable that we are interested in monitoring, and observations of y for a
group of m well-functioning individuals follow the model
y(tij) = µ(tij) + ε(tij), for i = 1, 2, . . . ,m, j = 1, 2, . . . , J, (1)
where tij is the jth observation time of the ith individual, y(tij) is the observed value of y at
tij , µ(tij) is the mean of y(tij), and ε(tij) is the error term. For simplicity, we assume that the
design space is [0, 1]. As in the literature of longitudinal data analysis (e.g., Li 2011), we further
assume that the error term ε(t), for any t ∈ [0, 1], consists of two independent components, i.e.,
ε(t) = ε0(t) + ε1(t), where ε0(·) is a random process with mean 0 and covariance function V0(s, t),
for any s, t ∈ [0, 1], and ε1(·) is a noise process satisfying the condition that ε1(s) and ε1(t) are
independent for any s, t ∈ [0, 1]. In this decomposition, ε1(t) denotes the pure measurement error,
and ε0(t) denotes all possible covariates that may affect y but are not included in model (1). In
such cases, the covariance function of ε(·) is
V(s, t) = Cov(ε(s), ε(t)) = V0(s, t) + σ1²(s) I(s = t), for any s, t ∈ [0, 1],   (2)
where σ1²(s) = Var(ε1(s)), and I(s = t) = 1 when s = t and 0 otherwise. In model (1), observations
of different individuals are assumed to be independent. As a sidenote, model (1) is similar to the
nonparametric mixed-effects model discussed in Qiu et al. (2010). Their major difference is that
the variance of the pure measurement error ε1(t) in (1) is allowed to vary over t, while such a
variance is assumed to be a constant in Qiu et al. (2010). Also, J is assumed to be a constant in
(1) for simplicity. Our method discussed in the paper can be generalized to cases when J depends
on i without much difficulty.
For the longitudinal model specified by (1) and (2), Chen et al. (2005) and Pan et al. (2009)
propose the following estimator of µ(t) based on the local pth-order polynomial kernel smoothing
procedure:
µ̂(t; V) = e1′ ( Σ_{i=1}^{m} Xi′ Vi⁻¹ Xi )⁻¹ ( Σ_{i=1}^{m} Xi′ Vi⁻¹ yi ),   (3)

where e1 is a (p + 1)-dimensional vector with 1 at the first component and 0 anywhere else,

Xi = [ 1   (ti1 − t)   · · ·   (ti1 − t)^p
       ⋮        ⋮        ⋱           ⋮
       1   (tiJ − t)   · · ·   (tiJ − t)^p ]   (a J × (p + 1) matrix),

Vi⁻¹ = Kih^{1/2}(t) (Ii Σi Ii)⁻¹ Kih^{1/2}(t), Kih(t) = diag{K((ti1 − t)/h), . . . , K((tiJ − t)/h)}/h, K is a
density kernel function, h > 0 is a bandwidth, Ii = diag{I(|ti1 − t| ≤ h), . . . , I(|tiJ − t| ≤ h)}, and
Σi is the covariance matrix of yi = (y(ti1), y(ti2), . . . , y(tiJ))′, which can be computed from V(s, t)
in (2). Throughout this paper, the inverse of a matrix refers to the Moore-Penrose generalized
inverse, which always exists.
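To make (3) concrete, here is a minimal sketch of the estimator in the working-independence case (all Σi replaced by the identity matrix, which is exactly the initial estimator µ̃(t) used in the next part). The Epanechnikov kernel and all function names are our own illustrative choices, not prescriptions of the paper:

```python
import numpy as np

def local_poly_mean(t, times, y, h, p=1):
    """Sketch of the local p-th order polynomial kernel estimator (3) of
    mu(t), with working independence (Sigma_i = identity).  In that case
    the estimator reduces to a single kernel-weighted least squares fit
    pooled over all m individuals.
    times, y : (m, J) arrays of observation times t_ij and responses y(t_ij)
    h        : bandwidth
    """
    u = times.ravel() - t                      # t_ij - t, pooled over i and j
    # Epanechnikov density kernel K((t_ij - t)/h)/h
    w = np.where(np.abs(u) <= h, 0.75 * (1.0 - (u / h) ** 2) / h, 0.0)
    X = np.vander(u, p + 1, increasing=True)   # columns 1, (t_ij - t), ..., (t_ij - t)^p
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * y.ravel())
    beta = np.linalg.pinv(A) @ b               # Moore-Penrose inverse, as in the paper
    return beta[0]                             # e_1' beta, i.e. the fitted intercept
```

With data simulated from model (1), the estimate tracks the true mean function closely; the cross-validation bandwidth selection discussed below is not included in this sketch.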
In practice, the error covariance function V (s, t) is usually unknown, and it needs to be es-
timated from the observed data. To this end, we propose a method consisting of several steps,
described below. First, we compute an initial estimator of µ(t), denoted as µ̃(t), by (3) in which
all Σi are replaced by the identity matrix. Obviously, µ̃(t) is just the local pth-order polynomial
kernel estimator based on the assumption that the error terms in (1) at different time points are
i.i.d. for each individual subject. Second, we define residuals
ε̃(tij) = y(tij)− µ̃(tij), for i = 1, 2, . . . ,m, j = 1, 2, . . . , J.
Third, when s ≠ t, we use the method originally proposed by Li (2011) to first estimate V0(s, t) in
(2) by
Ṽ0(s, t) = ( A1(s, t) V00(s, t) − A2(s, t) V10(s, t) − A3(s, t) V01(s, t) ) B(s, t)⁻¹,   (4)

where A1(s, t) = S20(s, t) S02(s, t) − S11(s, t)², A2(s, t) = S10(s, t) S02(s, t) − S01(s, t) S11(s, t),
A3(s, t) = S01(s, t) S20(s, t) − S10(s, t) S11(s, t), B(s, t) = A1(s, t) S00(s, t) − A2(s, t) S10(s, t) − A3(s, t) S01(s, t),

Sl1l2(s, t) = [1/(mJ(J − 1))] Σ_{i=1}^{m} Σ_{j=1}^{J} Σ_{j′≠j} ((tij − s)/h)^{l1} ((tij′ − t)/h)^{l2} Kh(tij − s) Kh(tij′ − t),

Vl1l2(s, t) = [1/(mJ(J − 1))] Σ_{i=1}^{m} Σ_{j=1}^{J} Σ_{j′≠j} ε̃(tij) ε̃(tij′) ((tij − s)/h)^{l1} ((tij′ − t)/h)^{l2} Kh(tij − s) Kh(tij′ − t),

Kh(tij − s) = K((tij − s)/h)/h, and l1, l2 = 0, 1, 2. Note that Ṽ0(s, t) in (4) may not be positive
semidefinite. To address this issue, the adjustment procedure (15) in Li (2011), which is described
in the supplementary file, can be used to regularize it into a well-defined covariance function.
Then, the variance of y(t), denoted as σy²(t) = V0(t, t) + σ1²(t), can be regarded as the mean
function of ε²(t), and it can be estimated from the new data {ε̃²(tij), i = 1, 2, . . . , m, j = 1, 2, . . . , J}
by (3), in which yi are replaced by (ε̃²(ti1), . . . , ε̃²(tiJ))′ and Σi are replaced by the identity matrix.
The resulting estimator is denoted as σ̃y²(t). Finally, we define the estimator of V(s, t) by

Ṽ(s, t) = Ṽ0(s, t) I(s ≠ t) + σ̃y²(t) I(s = t).   (5)
Consequently, the mean function µ(t) can be estimated by µ̂(t; Ṽ ).
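As an illustration of the third step, the moment estimator (4) can be computed directly from its definition. This sketch uses our own naming, an Epanechnikov kernel, and omits the positive-semidefiniteness adjustment of Li (2011):

```python
import numpy as np

def epan(u, h):
    # Density kernel K(u/h)/h with Epanechnikov K (an illustrative choice)
    z = u / h
    return np.where(np.abs(z) <= 1.0, 0.75 * (1.0 - z ** 2) / h, 0.0)

def V0_tilde(s, t, times, resid, h):
    """Sketch of estimator (4) of V0(s, t) from residuals eps_tilde(t_ij).
    times, resid : (m, J) arrays.  Accumulates the kernel-weighted moments
    S_{l1 l2} and V_{l1 l2} over all within-individual pairs j != j';
    the common factor 1/(m J (J - 1)) cancels in the final ratio."""
    m, J = times.shape
    S = np.zeros((3, 3))                    # S_{l1 l2}, l1, l2 = 0, 1, 2
    V = np.zeros((2, 2))                    # V_{l1 l2}, l1, l2 = 0, 1
    for i in range(m):
        us, ut = (times[i] - s) / h, (times[i] - t) / h
        ks, kt = epan(times[i] - s, h), epan(times[i] - t, h)
        for j in range(J):
            for jp in range(J):
                if jp == j:                 # the sums run over j != j'
                    continue
                w = ks[j] * kt[jp]
                if w == 0.0:
                    continue
                for l1 in range(3):
                    for l2 in range(3):
                        term = us[j] ** l1 * ut[jp] ** l2 * w
                        S[l1, l2] += term
                        if l1 < 2 and l2 < 2:
                            V[l1, l2] += resid[i, j] * resid[i, jp] * term
    A1 = S[2, 0] * S[0, 2] - S[1, 1] ** 2
    A2 = S[1, 0] * S[0, 2] - S[0, 1] * S[1, 1]
    A3 = S[0, 1] * S[2, 0] - S[1, 0] * S[1, 1]
    B = A1 * S[0, 0] - A2 * S[1, 0] - A3 * S[0, 1]
    return (A1 * V[0, 0] - A2 * V[1, 0] - A3 * V[0, 1]) / B
```

For residuals generated by a random-intercept model, V0(s, t) is constant off the diagonal, and the estimator recovers that constant up to smoothing error.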
Note that, in the four steps described above for computing µ̃(t), Ṽ0(s, t), σ̃2y(t), and µ̂(t; Ṽ ), the
bandwidth h could be chosen differently in each step. We did not distinguish them in the above
description for simplicity. In all our numerical examples presented later, they are allowed to be
different, and each of them is chosen separately and sequentially by the conventional cross-validation
(CV) procedure (cf., the description in the supplementary file) as follows. First, the bandwidth
for µ̃(t) is selected. Then, the bandwidths for Ṽ0(s, t) and σ̃2y(t) are selected, respectively. Finally,
the bandwidth for µ̂(t; Ṽ ) can be selected. Some theoretical properties of µ̂(t; Ṽ ) are given in a
proposition of the supplementary file, which shows that µ̂(t; Ṽ ) is statistically consistent under
some regularity conditions.
By the estimation procedure described above, we can compute the local polynomial kernel
estimator µ̂(t; Ṽ ) of the mean function µ(t) in cases when the error covariance function V (s, t) is
unknown. After µ(t) is estimated, the variance function σ2y(t) can be estimated in the same way,
except that the original observations {y(tij)} need to be replaced by
ε̂²(tij) = ( y(tij) − µ̂(tij; Ṽ) )², for i = 1, 2, . . . , m, j = 1, 2, . . . , J.
The resulting estimator of σy²(t) is denoted as σ̂y²(t).
The models (1) and (2) are quite general and they include a variety of different situations as
special cases. In certain applications, the variance structure of the observed data can be described
by a simpler model. For instance, in the SPC literature, we often consider cases when the error
terms at different time points are independent of each other or they follow a specific time series
model. In the current setup, the first scenario corresponds to the case when V0(s, t) = 0, for any
s, t ∈ [0, 1]. In such cases, µ(t) and σ2y(t) can be estimated in the same way as described above,
except that we do not need to estimate V0(s, t).
2.2 SPC charts for dynamically identifying irregular individuals
In the previous part, we define the estimators µ̂(t; Ṽ ) and σ̂2y(t) of the IC mean function µ(t)
and the IC variance function σ2y(t), based on the observed longitudinal data of m well-functioning
individuals which are sometimes called IC data for simplicity. In this part, we propose SPC charts
for sequentially monitoring the longitudinal behavior of a new individual, using the estimated
regular pattern of the longitudinal response variable y(t) that is described by µ̂(t; Ṽ ) and σ̂2y(t).
Assume that the new individual’s y values are observed at times t∗1, t∗2, . . . over the design
interval [0, 1]. In cases when that individual’s longitudinal behavior is IC, these y values should
also follow the model (1). For convenience of discussion, we re-write that model as
y(t∗j) = µ(t∗j) + σy(t∗j) ǫ(t∗j), for j = 1, 2, . . . ,   (6)
where σy(t)ǫ(t) here equals ε(t) in model (1). Thus, ǫ(t) in model (6) has mean 0 and variance 1
at each t. For the y values of the new individual, let us define their standardized values by
ǫ̂(t∗j) = ( y(t∗j) − µ̂(t∗j; Ṽ) ) / σ̂y(t∗j), for j = 1, 2, . . .   (7)
If {y(t∗j), j = 1, 2, . . .} are independent of each other and each of them has a normal
distribution, then {ǫ̂(t∗j), j = 1, 2, . . .} would be a sequence of asymptotically i.i.d. random variables
with a common asymptotic distribution of N(0, 1). If we further assume that {t∗j , j = 1, 2, . . .}
are equally spaced, then {ǫ̂(t∗j ), j = 1, 2, . . .} can be monitored by a conventional control chart,
such as a CUSUM or EWMA chart, as usual. For instance, to detect upward mean shifts in
{ǫ̂(t∗j ), j = 1, 2, . . .}, the charting statistic of the upward CUSUM chart can be defined by
C+j = max( 0, C+j−1 + ǫ̂(t∗j) − k ), for j ≥ 1,   (8)
where C+0 = 0, and k > 0 is the allowance constant. The chart gives a signal of an upward mean
shift if
C+j > ρ, (9)
where ρ > 0 is a control limit. To evaluate the performance of the CUSUM chart (8)-(9), we can
use the IC average run length (ARL), denoted as ARL0, which is the average number of time points
from the beginning of process monitoring to the signal time under the condition that the process
is IC, and the out-of-control (OC) ARL, denoted as ARL1, which is the average number of time
points from the occurrence of a shift to the signal time under the condition that the process is OC.
Usually, the allowance k is specified beforehand. It has been well demonstrated in the literature
that large k values are good for detecting large shifts and small k values are good for detecting
small shifts. If it is possible to know the shift size δ in the mean of ǫ̂, then the optimal k is δ/2 (cf.,
Moustakides 1986). The control limit ρ is chosen such that a pre-specified ARL0 value is reached.
Then, the chart performs better for detecting a given shift if its ARL1 value is smaller. For some
commonly used k and ARL0 values, the corresponding ρ values can be found from Table 3.2 in
Hawkins and Olwell (1998). They can also be computed easily by some software packages, such as
the R-package spc (Knoth 2011).
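In code, the chart (8)-(9) is a one-line recursion. The following sketch (our own function name and return convention) signals at the first j with C+j > ρ:

```python
def cusum_upward(eps_hat, k, rho):
    """Upward CUSUM chart (8)-(9) applied to the standardized
    observations eps_hat(t_j*).  Returns the 1-based index of the first
    observation at which the chart signals, or None if it never does."""
    c = 0.0                              # C_0^+ = 0
    for j, e in enumerate(eps_hat, start=1):
        c = max(0.0, c + e - k)          # charting statistic C_j^+ in (8)
        if c > rho:                      # decision rule (9)
            return j
    return None
```

For example, with k = 0.5 and ρ = 3, a sustained upward shift of 1.5 standard deviations is flagged after four observations, while a sequence held at the IC mean never signals.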
In practice, however, the observation times {t∗j , j = 1, 2, . . .} may not be equally spaced, as
in the total cholesterol level example discussed in Section 1. In such cases, ARL0 and ARL1 are
obviously inappropriate for measuring the performance of the CUSUM chart (8)-(9), and we need to
define new performance measures. To this end, let ω > 0 be a basic time unit in a given application,
which is the largest time unit such that all unequally spaced observation times are integer multiples of it.
Define
n∗j = t∗j/ω, for j = 0, 1, 2, . . .
where n∗0 = t∗0 = 0. Then, t∗j = n∗j ω for all j, and n∗j is the jth observation time in the basic time
unit. In cases when the new individual is IC and the CUSUM chart (8)-(9) gives a signal at the sth
observation time, then n∗s is a random variable measuring the time to a false signal. Its mean E(n∗s)
measures the IC average time to the signal (ATS), denoted as ATS0. If the new individual has an
upward mean shift starting at the τth observation time and the CUSUM chart (8)-(9) gives a signal
at the sth time point with s ≥ τ , then the mean of n∗s−n∗τ is called the OC ATS, denoted as ATS1.
Then, to measure the performance of a control chart (e.g., the chart (8)-(9)), its ATS0 value can be
fixed at a certain level beforehand, and the chart performs better if its ATS1 value is smaller when
detecting a shift of a given size. It is obvious that the values of ATS0 and ATS1 are just constant
multiples of the corresponding values of ARL0 and ARL1 in cases when the observation times are
equally spaced. In such cases, the two sets of measures are equivalent. Also, when the observation
times are unequally spaced, it is possible to use the IC average number of observations to signal
(ANOS), denoted as ANOS0, and the OC average number of observations to signal, denoted as
ANOS1, for measuring the performance of control charts. However, these quantities are also just
constant multiples of ATS0 and ATS1, in cases when the sampling distribution does not change
over time. Furthermore, when the sampling distribution changes over time, ANOS0 and ANOS1
may not be able to measure the chart performance well. As a demonstration, let us consider two
scenarios. In the first scenario, we collect observations at the following basic time units: 1, 3, 7,
. . . , and a signal by a chart is given at the third observation time. In the second scenario, we collect
observations at the following basic time units: 1, 30, 100, . . . , and a signal by another chart is also
given at the third observation time. Assume that the processes are IC in both scenarios. Then, the
IC performance of the two charts is the same if ANOS0 is used as a performance measure. But,
the second chart does not give a false signal until the 100th basic time unit, while the first chart
gives a false signal at the 7th basic time unit. Obviously, ANOS0 cannot tell this quite dramatic
difference. For all these reasons, only ATS0 and ATS1 are used in the remaining part of the paper
for measuring the performance of a control chart.
If the IC mean function µ(t) and the IC variance function σ2y(t) are known, then the standard-
ized observations of the new individual can be defined by
ǫ(t∗j) = ( y(t∗j) − µ(t∗j) ) / σy(t∗j), for j = 1, 2, . . .   (10)
In such cases, as long as the distribution of the observation times {t∗j , j = 1, 2, . . .} is specified
properly, for a given k value in (8) and a given ATS0 value, we can easily compute the corresponding
value of the control limit ρ by simulation such that the pre-specified ATS0 value is reached. For
instance, when the distribution of the observation times is specified by the sampling rate d, which is
defined to be the number of observation times every 10 basic time units here, the computed ρ values
in cases when ATS0 = 20, 50, 100, 150, 200, k = 0.1, 0.2, 0.5, 0.75, 1.0, and d = 2, 5, 10 are presented
in Table 1. From the table, it can be seen that ρ increases with ATS0 and d, and decreases with
k. In cases when µ(t) and σ2y(t) are unknown, they need to be estimated from an IC dataset by
µ̂(t; Ṽ ) and σ̂2y(t), as discussed in Section 2.1. In such cases, the standardized observations of the
new individual defined by (7) are only asymptotically N(0, 1) distributed when the new individual
is IC. When the IC sample size is moderate to large, we can still use the control limit ρ computed
in the case of (10), which will be justified by some numerical results described in the next section.
Table 1: Control limits of the CUSUM chart (8)-(9) for known IC mean and variance functions and
various combinations of ATS0, k, and d described in the text. For the entry with a “∗” symbol,
the actual ATS0 value is more than 1% away from the assumed ATS0 value. For all other entries,
the actual ATS0 values are within 1% of the assumed ATS0 values.

 d    k      ATS0=20   ATS0=50   ATS0=100   ATS0=150   ATS0=200
 2    0.1     0.929     1.844     2.820      3.552      4.171
      0.2     0.774     1.571     2.351      2.928      3.396
      0.5     0.382     0.986     1.493      1.822      2.088
      0.75    0.118     0.654     1.064      1.312      1.507
      1.0     0.001∗    0.355     0.718      0.936      1.101
 5    0.1     1.844     3.215     4.612      5.631      6.445
      0.2     1.571     2.664     3.713      4.381      5.005
      0.5     0.986     1.685     2.234      2.586      2.898
      0.75    0.654     1.200     1.612      1.869      2.072
      1.0     0.355     0.837     1.189      1.404      1.562
10    0.1     2.820     4.612     6.407      7.649      8.612
      0.2     2.351     3.713     4.937      5.744      6.408
      0.5     1.493     2.235     2.852      3.245      3.542
      0.75    1.061     1.610     2.039      2.310      2.509
      1.0     0.718     1.192     1.538      1.738      1.894
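The simulation used to produce control limits like those above can be sketched as follows, assuming i.i.d. N(0, 1) standardized observations and a constant sampling rate d; the replication count, search interval, and bisection tolerance are our own illustrative choices:

```python
import numpy as np

def simulate_ats0(rho, k, d, nrep=400, horizon=2000, seed=7):
    """Monte Carlo estimate of ATS0 (in basic time units) for the CUSUM
    chart (8)-(9), when d observations are taken every 10 basic time
    units and the standardized observations are i.i.d. N(0, 1)."""
    rng = np.random.default_rng(seed)
    gap = 10.0 / d                       # sampling interval in basic time units
    total = 0.0
    for _ in range(nrep):
        c, j = 0.0, 0
        while True:
            j += 1
            c = max(0.0, c + rng.standard_normal() - k)
            if c > rho or j >= horizon:
                break
        total += j * gap                 # signal time in basic time units
    return total / nrep

def find_rho(target_ats0, k, d, lo=0.0, hi=4.0, tol=0.05):
    """Bisection search for the control limit rho attaining target_ats0;
    the fixed seed makes the simulated ATS0 monotone in rho for the search."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulate_ats0(mid, k, d) < target_ats0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 0.5, d = 5 and ATS0 = 50, this reproduces, up to Monte Carlo error, the value 1.685 reported in Table 1.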
In the SPC literature, the concept of ATS has been proposed for measuring the performance of
a control chart with variable sampling intervals (VSIs). See, for instance, Reynolds et al. (1990).
However, the VSI problem in SPC is completely different from the DySS problem discussed here. In
the VSI problem, the next observation time is determined by the current and all past observations
of the process; the next observation is collected sooner if there is more evidence of a shift based on
all available process observations, and later otherwise. Therefore, the VSI scheme is an integrated
part of process monitoring. As a comparison, in the DySS problem, observation times are often
not specifically chosen for improving process monitoring. In the total cholesterol level example
discussed in Section 1, for instance, patients often choose their clinic visit times based on their
convenience or health conditions.
At the end of this subsection, we would like to make several remarks about the DySS problem
and the CUSUM chart (8)-(9). First, the above description focuses on cases to detect upward
shifts in the process mean function µ(t) only. Control charts for detecting downward mean shifts or
arbitrary mean shifts can be developed in the same way, except that the upward charting statistic
in (8) and the decision rule in (9) should be changed to the downward or two-sided version (cf., Qiu
2013, Chapter 4). Second, control charts for detecting shifts in the process variance function σ2y(t),
or shifts in both µ(t) and σ2y(t), can be developed in a similar way, after (8) and (9) are replaced
by appropriate charting statistics and the related decision rules. See, for instance, Gan (1995) and
Yeh et al. (2004) for discussions about the corresponding problems in the conventional SPC setup.
Third, although a CUSUM chart is discussed above, EWMA charts and charts based on change-
point detection (CPD) can also be used in the DySS problem. Generally speaking, the CUSUM
and EWMA charts have similar performance, and the CPD charts can provide estimates of the shift
position at the time when they give signals although their computation is usually more extensive.
See Qiu (2013) for a related discussion. Fourth, to use the proposed DySS method, the observation
times {t∗j} of the new individual to monitor usually cannot be larger than the largest value of the
observation times {tij} in the IC data, to avoid extrapolation in the data standardization in (7).
In other words, it is inappropriate to use our DySS method to monitor the new individual outside the time
range [min_{i,j} tij, max_{i,j} tij].
From the description about the DySS method above, it can be seen that this methodology actually
combines the cross-sectional comparison between the new individual to monitor and the m well-
functioning individuals in the IC data with the sequential monitoring of the new individual in
a dynamic manner. Therefore, to use the method to monitor the longitudinal behavior of the
new individual at time t, there should exist observations around t in the IC data to make the
cross-sectional comparison possible. Fifth, although the simplest CUSUM chart (8)-(9) is used
in this paper to demonstrate the DySS method, some existing research in the SPC literature to
accommodate correlated data (e.g., Apley and Tsung 2002, Jiang 2004) or to detect dynamic mean
shifts (e.g., Han and Tsung 2006, Shu and Jiang 2006) might also be considered here. In the
numerical study presented in Section 3, the control charts by Han and Tsung (2006) and Shu and
Jiang (2006) will be discussed.
2.3 Cases when the longitudinal observations follow AR(1) model
In the above discussion, we assume that the longitudinal observations {y(t∗j ), j = 1, 2, . . .}
are independent. In practice, they are often correlated. In the SPC literature, time series models,
especially the AR(1) model, are commonly used for describing the autocorrelation in such cases (cf.,
Jiang et al. 2000, Lu and Reynolds 2001). In this section we discuss how to use the proposed DySS
procedure in cases when the longitudinal observations follow an AR(1) model. In a conventional
time series model, observation times are usually equally spaced. In the current DySS problem,
however, they can be unequally spaced. In such cases, time series modeling is especially challenging
(cf., Maller et al. 2008, Vityazev 1996); a version of our DySS approach based on a more general
time series model is not yet available and is left for future research.
Let us first discuss estimation of the regular longitudinal pattern from an IC dataset when
the longitudinal data are correlated and follow an AR(1) model. Assume that there are m well-
functioning individuals whose longitudinal observations follow the model (1) in which ε(tij) =
σy(tij)ǫ(tij) and ǫ(tij) has mean 0 and variance 1 for each tij ∈ [0, 1]. Instead of assuming indepen-
dence, in this subsection, we assume that ǫ(tij) follow the AR(1) model
ǫ(tij) = φǫ(tij − ω) + e(tij), for i = 1, ...,m, j = 1, ..., J, (11)
where φ is a coefficient, ω is the basic time unit defined in Section 2.2, and e(t) is a white noise
process over [0, 1]. The model (11) is the conventional AR(1) model for cases with equally spaced
observation times. It does not have a constant term on the right-hand-side because the mean of
ǫ(tij) is 0. In the current DySS problem, it is inconvenient to work with the model (11) directly
because there may not be observations at the times tij − ω that are used in (11). To overcome
this difficulty, we can transform the AR(1) model (11) to the following time series model using the
actual observation times {tij}:
ǫ(tij) = φ^{∆i,j−1} ǫ(ti,j−1) + Θ̃ij(z) e(tij),   (12)
where ∆i,j−1 = (tij − ti,j−1)/ω, z is the lag operator used in time series modeling (e.g., zǫ(t) =
ǫ(t − ω)), and
Θ̃ij(z) = 1 + φz + · · · + φ^{∆i,j−1 − 1} z^{∆i,j−1 − 1}
is a lag polynomial. As in Sections 2.1 and 2.2, let µ̂(t; Ṽ ) and σ̂2y(t) be the local pth-order
polynomial kernel estimators of µ(t) and σ2y(t), and
ǫ̂(tij) = ( y(tij) − µ̂(tij; Ṽ) ) / σ̂y(tij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.
Then, the least squares (LS) estimator of φ, denoted as φ̂, can be obtained by minimizing
Σ_{i=1}^{m} Σ_{j=2}^{J} { ǫ̂(tij) − φ^{∆i,j−1} ǫ̂(ti,j−1) }².
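This LS step can be sketched as follows; the grid search over φ ∈ (0, 1) is our own simplification (the objective is smooth in φ, so any scalar optimizer could replace it), and positive φ is assumed so that φ^∆ is defined for non-integer gaps:

```python
import numpy as np

def fit_phi(times, eps_hat, omega, grid=None):
    """LS estimator phi_hat: minimize over phi the sum of
    { eps_hat(t_ij) - phi^{Delta_{i,j-1}} eps_hat(t_{i,j-1}) }^2,
    where Delta_{i,j-1} = (t_ij - t_{i,j-1}) / omega.
    times, eps_hat : (m, J) arrays of times and standardized residuals."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)        # candidate phi values in (0, 1)
    delta = np.diff(times, axis=1) / omega        # gaps Delta in basic time units
    prev, curr = eps_hat[:, :-1], eps_hat[:, 1:]
    sse = [np.sum((curr - phi ** delta * prev) ** 2) for phi in grid]
    return grid[int(np.argmin(sse))]
```

On simulated AR(1) data with equally spaced times and φ = 0.5, the estimate recovers φ to within grid resolution and sampling error.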
Next, we use the estimated IC longitudinal pattern described by µ̂(t; Ṽ), σ̂²_y(t), and φ̂ to monitor a new individual's longitudinal pattern. Assume that the new individual's y observations follow model (6), in which the error term ǫ(t*_j) follows the AR(1) model

ǫ(t*_j) = φ ǫ(t*_j − ω) + e(t*_j),  for j = 1, 2, ...,

where φ is the same coefficient as the one in (11). Then, similar to the relationship between models (11) and (12), the above AR(1) model implies that

ǫ(t*_j) = φ^{Δ*_{j−1}} ǫ(t*_{j−1}) + e*_j,    (13)

where Δ*_{j−1} = (t*_j − t*_{j−1})/ω, e*_j = Θ̃*_j(z) e(t*_j), and

Θ̃*_j(z) = 1 + φz + ··· + φ^{Δ*_{j−1}−1} z^{Δ*_{j−1}−1}.
It can be checked that, when the new individual is IC, {e*_j, j = 1, 2, ...} are independent with mean 0 and variance σ²_{e*_j} = 1 − φ^{2Δ*_{j−1}}. In the SPC literature, it has been well demonstrated that, when ǫ(t*_j) follows the time series model (13), an upward mean shift in ǫ(t*_j) can be detected by monitoring {e*_j, j = 1, 2, ...} (cf., e.g., Lu and Reynolds 2001). Based on all these results, to detect an upward mean shift in the original response y, the charting statistic of the upward CUSUM chart can be defined by

C_j^+ = max[ 0, C_{j−1}^+ + ( ǫ̂(t*_j) − φ̂^{Δ*_{j−1}} ǫ̂(t*_{j−1}) ) / √(1 − φ̂^{2Δ*_{j−1}}) − k ],  for j ≥ 2,    (14)
where C_1^+ = 0, k > 0 is an allowance constant, and ǫ̂(t*_j) = (y(t*_j) − µ̂(t*_j; Ṽ)) / σ̂_y(t*_j). The chart gives a signal when

C_j^+ > ρ,    (15)

where ρ > 0 is a control limit chosen to achieve a given ATS0 value.
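In code, the monitoring scheme (14)-(15) can be sketched as follows (a Python sketch, not the authors' MATLAB implementation; mu_hat and sigma_hat stand for the estimated IC mean and standard deviation functions, and all names are ours):

```python
# Upward CUSUM chart (14)-(15) for a new individual with AR(1) errors.

def cusum_monitor(y, t, mu_hat, sigma_hat, phi_hat, omega, k, rho):
    """Return the first index j (0-based) at which C_j^+ > rho, or None.

    C_j^+ = max(0, C_{j-1}^+ + (eps[j] - phi_hat**Delta * eps[j-1])
                               / sqrt(1 - phi_hat**(2*Delta)) - k),  C_1^+ = 0.
    """
    eps = [(yj - mu_hat(tj)) / sigma_hat(tj) for yj, tj in zip(y, t)]
    c = 0.0
    for j in range(1, len(eps)):
        delta = (t[j] - t[j - 1]) / omega                  # Delta*_{j-1}
        num = eps[j] - (phi_hat ** delta) * eps[j - 1]
        den = (1.0 - phi_hat ** (2.0 * delta)) ** 0.5      # sd of e_j* when IC
        c = max(0.0, c + num / den - k)
        if c > rho:
            return j
    return None
```

When φ̂ = 0, the statistic reduces to a CUSUM of the standardized observations themselves.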
At the end of this subsection, we would like to point out that there are actually two different ways to estimate µ(t), σ²_y(t), and φ from an IC dataset when the AR(1) model (12) is involved. Because the estimation procedure described in Section 2.1 does not require the error covariance function V(s, t) to be known, the first way is to estimate µ(t) and σ²_y(t) by the four steps described in Section 2.1 without using any prior information about the error structure, and then estimate φ by the LS procedure described above. Alternatively, in the case when we are sure that the AR(1) model (12) is appropriate for describing the error structure, a parametric formula of V(s, t) can be derived in terms of φ, denoted as V_φ(s, t). Then, µ(t) and σ²_y(t) can be estimated by µ̂(t; V_φ) and σ̂²_{y,φ}(t), respectively, where σ̂²_{y,φ}(t) is the same as σ̂²_y(t) except that µ̂(t; V_φ) instead of µ̂(t; Ṽ) is used in its construction. Then, φ can be estimated by the LS estimator defined as the solution of

min_φ Σ_{i=1}^{m} Σ_{j=2}^{J} { ǫ̂_φ(t_ij) − φ^{Δ_{i,j−1}} ǫ̂_φ(t_{i,j−1}) }²,

where

ǫ̂_φ(t_ij) = (y(t_ij) − µ̂(t_ij; V_φ)) / σ̂_{y,φ}(t_ij),  for i = 1, 2, ..., m, j = 1, 2, ..., J.
We have checked that the results of these two approaches are close to each other as long as the IC sample size is not small. In all simulation examples discussed in Section 3, the second approach is used, while the first approach is used in the real-data example discussed in Section 4, because we are not sure whether the AR(1) model is appropriate at the beginning of the data analysis.
2.4 Cases with arbitrary autocorrelation
In the previous two parts, we assume that the original observations {y(t*_j), j = 1, 2, ...} of a new individual are either independent or correlated with the autocorrelation described by the AR(1) model (11). Also, the observation distribution is assumed to be normal. In practice, all these assumptions could be violated. If one or more of these model assumptions are violated, then the related control charts (8)-(9) and (14)-(15) may not be reliable, because their actual ATS0 values could be substantially different from the assumed ATS0 values. See related discussions in Hawkins and Deng (2010), Qiu and Hawkins (2001), Qiu and Li (2011a,b), and Zou and Tsung (2010).
In this subsection, we propose a numerical approach to compute the control limit ρ of the chart (8)-(9) from an IC dataset, such that the chart remains appropriate to use when within-subject observations are correlated with an arbitrary autocorrelation and when a parametric form of their distribution is unavailable.
Assume that there is an IC dataset consisting of a group of m well-functioning individuals whose observations of y follow model (1). The data of the first m1 individuals are used for obtaining the estimators µ̂(t; Ṽ) and σ̂²_y(t), as discussed in Section 2.1. Then, the control limit ρ of the chart (8)-(9) can be computed from the remaining m2 = m − m1 individuals using a block bootstrap procedure (e.g., Lahiri 2003) described below. First, we compute the standardized observations of the m2 well-functioning individuals by

ǫ̂(t_ij) = (y(t_ij) − µ̂(t_ij; Ṽ)) / σ̂_y(t_ij),  for i = m1 + 1, m1 + 2, ..., m, j = 1, 2, ..., J.
Second, we randomly select B individuals with replacement from the m2 well-functioning individuals, and use their standardized observations to compute the value of ρ by a numerical searching
algorithm so that a given ATS0 level is reached. Such a searching algorithm has been well described
in Qiu (2008) and Qiu (2013, Section 4.2). Basically, we first give an initial value to ρ, and compute
the actual ATS0 value of the chart (8)-(9) from the standardized observations of the B resampled
individuals. If the actual ATS0 value is smaller than the given ATS0 value, then the value of ρ
is increased; otherwise, the value of ρ is decreased. Then, the actual ATS0 value of the chart is
computed again, using the updated value of ρ. This process is repeated until the given ATS0 level
is reached to a certain accuracy. In all examples of this paper, we choose m1 = m/2.
It should be pointed out that, if it is reasonable to assume that the within-subject observations
are independent, then we can also draw bootstrap samples in the conventional way from the set
of residuals {ǫ̂(tij), i = m1 + 1,m1 + 2, . . . ,m, j = 1, 2, . . . , J}. Namely, we can randomly select J
residuals with replacement from this set as the standardized observations of a bootstrap resampled
“individual”, and B bootstrap resampled individuals can be generated in that way for computing
the value of ρ. When the within-subject observations are correlated, it has been demonstrated in the literature that a block bootstrap procedure, such as the one described above, would be more appropriate to use (cf., Lahiri 2003).
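The calibration just described can be sketched as follows; the bisection search and the convention of cycling each finite residual sequence to obtain a run length are ours, so this is a sketch of the idea rather than the authors' implementation:

```python
import random

# Block bootstrap computation of the control limit rho for chart (8)-(9).
# Each resampled block is one individual's whole residual sequence, which
# preserves the within-subject correlation structure.

def run_length(eps, k, rho):
    """Observations until the upward CUSUM C_j^+ = max(0, C_{j-1}^+ + eps_j - k)
    exceeds rho; the finite sequence is cycled so a signal always occurs."""
    c, j, n = 0.0, 0, len(eps)
    while True:
        c = max(0.0, c + eps[j % n] - k)
        j += 1
        if c > rho:
            return j

def bootstrap_rho(residuals, k, target_ats0, d, B=1000, lo=0.0, hi=20.0, iters=30):
    """Bisection on rho until the bootstrap ATS_0 matches target_ats0 (in basic
    time units); d is the sampling rate (observations per 10 basic time units)."""
    blocks = [random.choice(residuals) for _ in range(B)]    # resample individuals
    def ats(rho):
        return sum(run_length(b, k, rho) for b in blocks) / B * (10.0 / d)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if ats(mid) < target_ats0:
            lo = mid        # in-control ATS too small: raise the limit
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Because the bootstrap ATS0 is nondecreasing in ρ for a fixed set of blocks, the bisection converges to the crossing point of the target level.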
3 Numerical Study
In this section, we present some simulation results to investigate the numerical performance of the proposed DySS procedure described in the previous section. When estimating the mean function µ(t) and the variance function σ²_y(t) of the longitudinal response y(t) from an IC dataset by the local smoothing method described in Section 2.1, p is fixed at 1 (i.e., local linear kernel smoothing is used), the kernel function is chosen to be the Epanechnikov kernel K(t) = 0.75(1 − t²) I(|t| ≤ 1), which is widely used in the curve estimation literature, and all bandwidths are chosen by the CV procedure.
In the first example, it is assumed that the IC mean function µ(t) and the IC variance function σ²_y(t) are known to be

µ(t) = 1 + 0.3 t^{1/2},  σ²_y(t) = µ²(t),  for t ∈ [0, 1],

observations of a given individual at different time points are independent of each other, and the sampling rate is the same for all individuals, namely d observations every 10 basic time units. In such cases, the standardized observations are defined in (10), and the control limit ρ of the CUSUM chart (8)-(9) can easily be computed by a numerical algorithm once its allowance constant k and the ATS0 value are pre-specified (cf., Table 1 and the related description in Subsection 2.2). Now,
let us consider the mean shifts from µ(t) to
µ1(t) = µ(t) + δ, for t ∈ [0, 1],
that occur at the initial time point, where δ is the shift size taking values from 0 to 2.0 with a step of 0.1. The basic time unit in this example is ω = 0.001, and the sampling rate d is chosen to be 2, 5, or 10. In the CUSUM chart (8)-(9), the ATS0 value is fixed at 100, and the allowance constant k is chosen to be 0.1, 0.2, or 0.5. The corresponding ρ values are computed to be 2.820, 2.351, and 1.493 when d = 2; 4.612, 3.713, and 2.234 when d = 5; and 6.407, 4.937, and 2.852 when d = 10.
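These reported (k, ρ) values can be spot-checked with a short Monte Carlo sketch. Assuming (8)-(9) is the analogous one-sided CUSUM applied directly to the standardized observations, which are i.i.d. N(0, 1) when the process is IC, the simulated ATS0 should come out close to the nominal 100 basic time units (the names and simulation setup below are ours):

```python
import random

# Monte Carlo estimate of ATS_0: average signal time, in basic time units,
# of an upward CUSUM with allowance k and control limit rho, when the
# standardized observations are i.i.d. N(0, 1) and arrive every 10/d units.

def simulated_ats0(k, rho, d, reps=4000, seed=7):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        c, j = 0.0, 0
        while True:
            j += 1
            c = max(0.0, c + rng.gauss(0.0, 1.0) - k)
            if c > rho:
                break
        total += j * (10.0 / d)       # j-th observation arrives at j*10/d units
    return total / reps
```

For example, simulated_ats0(0.1, 2.820, 2) should land near 100, up to Monte Carlo error.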
The ATS1 values of the chart in cases when d = 2, 5, 10, computed from 10,000 replicated simulations, are shown in Figure 1(a)-(c). From the plots, it can be seen that: (i) the chart with a smaller (larger) k performs better for detecting small (large) shifts, and (ii) the ATS1 values tend to be smaller when d is larger. The first result is generally true for CUSUM charts (cf., Hawkins and Olwell 1998), and the second result is intuitively reasonable because more observations are available for process monitoring when d is larger.
Figure 1: (a)-(c) ATS1 values of the chart (8)-(9) in cases when the IC process mean function µ(t) and the process variance function σ²_y(t) are assumed known, k = 0.1, 0.2, 0.5, ATS0 = 100, the step shift size δ changes its value from 0 to 2 with a step of 0.1, and the sampling rate d = 2 (plot (a)), d = 5 (plot (b)), and d = 10 (plot (c)). (d)-(f) Corresponding results when the mean shift is a nonlinear drift from µ(t) to µ1(t) = µ(t) + δ(1 − exp(−10t)), for t ∈ [0, 1]. The y-axes of all plots are in natural log scale.
In practice, a more realistic scenario would be that, once a shift occurs, the shift size would
change over time. In the total cholesterol level example discussed in Section 1, if a patient’s total
cholesterol level begins to deviate from the regular pattern and no medical treatments are given in
a timely fashion, then his/her total cholesterol level could deviate from the regular pattern more
and more over time. To simulate such scenarios, we consider a nonlinear drift from µ(t) to
µ1(t) = µ(t) + δ(1 − exp(−10t)),  for t ∈ [0, 1],
that occurs at the initial time point, where δ is a constant. When δ changes from 0 to 2.0 with a step of 0.1 and the other settings remain the same as those in Figure 1(a)-(c), the ATS1 values of the chart (8)-(9) are shown in Figure 1(d)-(f). From the plots, conclusions similar to those from Figure 1(a)-(c) can be made. We also performed simulations in cases when ω = 0.005 and ATS0 = 20. The results are included in a supplementary file, and they demonstrate a pattern similar to that in Figure 1.
Next, we consider a more realistic situation when the IC mean function µ(t) and the IC variance function σ²_y(t) are both unknown and need to be estimated from an IC dataset. Assume that the IC dataset includes observations of m individuals. For each individual, the sampling rate is d, which is assumed to be the same as that of the new individual for online monitoring, and the basic time unit is ω = 0.001. Then, when d = 2, each individual in the IC dataset has 200 observations in the design interval [0, 1]. After µ(t) and σ²_y(t) are estimated from the IC dataset, observations of the new individual are standardized by (7) for online monitoring. The standardized observations {ǫ̂(t*_j)} are regarded as i.i.d. random variables with the common distribution N(0, 1) when the new individual is IC. When ATS0, k, and d are given, the control limit ρ can be computed as if µ(t) and σ²_y(t) were known (cf., Table 1). Then, the actual ATS0 value of the chart (8)-(9) is computed using 10,000 simulations of the online monitoring. This entire process, starting from the generation of the IC data to the computation of the actual ATS0 value, is then repeated 100 times. The averaged
actual ATS0 values in cases when the nominal ATS0 = 100, k = 0.1, 0.2 or 0.5, d = 2, 5 or 10, and
m = 5, 10 or 20 are presented in Figure 2, along with the ATS1 values of the chart for detecting
step shifts of size δ = 0.25, 0.5, 0.75, and 1.0, occurring at the initial time point. These ATS0 and
ATS1 values are also presented in Table S.1 of the supplementary file. From the figure and the
table, it can be seen that: (i) the actual ATS0 values are within 5% of the nominal ATS0 value of
100 in all cases when m ≥ 10, (ii) the actual ATS0 values are closer to the nominal ATS0 value
when d is larger, (iii) the ATS1 values are smaller if δ is larger or d is larger, (iv) the ATS1 values
do not depend on the m value much, especially when δ and d are large, and (v) the ATS1 values
are the smallest when k = 0.1 and δ is small (i.e., δ = 0.25), or when k = 0.2 and δ is medium
(e.g., δ = 0.5 or 0.75), or when k = 0.5 and δ is large (i.e., δ = 1.0). The corresponding results for
detecting the drifts considered in Figure 1(d)-(f) are presented in Figure S.2 and Table S.2 of the
supplementary file. Results in cases when ATS0 = 20 are presented in Tables S.3 and S.4 of the
supplementary file. Similar conclusions can be made from all these results.
In the above examples, the sampling rate d is chosen as 2, 5, and 10. In cases when ω = 0.001,
Figure 2: Actual ATS0 and ATS1 values of the chart (8)-(9), for detecting a step mean shift of size δ occurring at the initial time point, in cases when the IC mean function µ(t) and the IC variance function σ²_y(t) are estimated from an IC dataset with m individuals, m = 5, 10, or 20, k = 0.1 (1st column), 0.2 (2nd column), or 0.5 (3rd column), d = 2 (1st row), 5 (2nd row), or 10 (3rd row), ω = 0.001, and the nominal ATS0 is 100.
each individual would have 200, 500, and 1000 observations, respectively, in the design interval
[0, 1]. In some applications, such as the total cholesterol level example discussed in Section 4, the
individuals have fewer observations. Next, in the example of Figure 2, we choose d to be 0.2, 0.5,
and 1, m to be 10, 20, and 40, and the other parameters to be unchanged. The corresponding
results of the actual ATS0 and ATS1 values of the chart (8)-(9) for detecting the step shifts are
shown in Table S.5 and Figure S.3 of the supplementary file. The results for detecting the drifts
are shown in Table S.6 and Figure S.4 of the supplementary file. From these figures and tables, it
can be seen that most conclusions made above from Figures 2 and S.2 remain valid here, except that (i) for a given m value, the actual ATS0 values in this case are a little farther from the nominal ATS0 value of 100, compared to the actual ATS0 values shown in Figures 2 and S.2, (ii) m ≥ 20 is required in this case for the actual ATS0 values to be within about 5% of the nominal ATS0 value, and (iii) the actual ATS1 values in this case are generally larger than the corresponding values shown in Figures 2 and S.2. All these results are intuitively reasonable.
In the literature, there are some existing control charts designed specifically for detecting time-varying mean shifts, or mean drifts. One type of such control charts is based on the idea of designing a chart adaptively using a dynamic prediction of the shift size. Such adaptive CUSUM or EWMA
charts (cf., Capizzi and Masarotto 2003, Shu and Jiang 2006, Sparks 2000) can be used in our
proposed DySS method in place of the conventional CUSUM chart (8)-(9). Another control chart
designed for detecting time-varying mean shifts is the reference-free cuscore (RFCuscore) chart by
Han and Tsung (2006). Next, we consider using the adaptive CUSUM chart by Shu and Jiang (2006) and the RFCuscore chart by Han and Tsung (2006) in our DySS method. For the adaptive CUSUM chart, the charting statistic C_j^+ in (8) is changed to

C_{j,A}^+ = max[ 0, C_{j−1,A}^+ + ( ǫ̂(t*_j) − δ̂_j^+/2 ) / ρ(δ̂_j^+/2) ],  for j ≥ 1,

where C_{0,A}^+ = 0, and δ̂_j^+ is the predicted shift size at time j, defined by

δ̂_j^+ = max( δ_min^+, (1 − λ) δ̂_{j−1}^+ + λ ǫ̂(t*_j) ),  for j ≥ 1,

with δ̂_0^+ = δ_min^+, where δ_min^+ is a parameter denoting the minimum shift size, λ ∈ (0, 1] is a weighting parameter, ρ(k) is a control limit defined by

ρ(k) = [ log(1 + 2k²ARL0 + 2.332k) ] / (2k) − 1.166,

and ARL0 = ATS0 · d/10 is the pre-specified IC ARL value. The chart gives a signal at time j when C_{j,A}^+ > 1. As in Shu and Jiang (2006), we choose λ = 0.1 in this procedure. The RFCuscore chart gives a signal at time j if

max_{1≤w≤j} Σ_{i=j−w+1}^{j} |ǫ̂(t*_i)| ( ǫ̂(t*_i) − |ǫ̂(t*_i)|/2 ) ≥ c,
where c is a control limit chosen to reach a given ATS0 level. In the example of Figures 2 and S.2, the actual ATS0 and ATS1 values of the adaptive CUSUM chart and the RFCuscore chart are computed in the same way as those of the chart (8)-(9), in cases when δ_min^+ is chosen to be 0.2, 0.4, and 1.0 and the other parameters are set to the same values as those in Figures 2 and S.2. These values are presented in Tables S.7-S.9 and Figures S.5-S.7 of the supplementary file. In cases when m = 20, k = 0.2 in the chart (8)-(9), and δ_min^+ = 0.4 in the adaptive CUSUM chart, the computed actual ATS0 and ATS1 values of the three charts are shown in Figure S.8 of the supplementary file, for detecting both step shifts and drifts. From these results, it can be seen that (i) the conventional CUSUM chart (8)-(9) and the adaptive CUSUM chart perform almost identically in all cases considered, and (ii) the RFCuscore chart performs slightly worse in certain cases, especially when detecting drifts.
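Both alternative schemes can be sketched directly from the formulas above (Python sketches with our own names; the RFCuscore running maximum over window sizes w is computed with the standard Kadane-type recursion M_j = s_j + max(M_{j−1}, 0)):

```python
import math

def rho_limit(k, arl0):
    """rho(k) = [log(1 + 2*k^2*ARL_0 + 2.332*k)] / (2k) - 1.166."""
    return math.log(1.0 + 2.0 * k * k * arl0 + 2.332 * k) / (2.0 * k) - 1.166

def adaptive_cusum(eps, arl0, delta_min=0.4, lam=0.1):
    """First j (1-based) with C_{j,A}^+ > 1, or None (Shu and Jiang 2006)."""
    c, delta = 0.0, delta_min
    for j, e in enumerate(eps, start=1):
        delta = max(delta_min, (1.0 - lam) * delta + lam * e)  # predicted shift
        k = delta / 2.0
        c = max(0.0, c + (e - k) / rho_limit(k, arl0))
        if c > 1.0:
            return j
    return None

def rfcuscore(eps, c_limit):
    """First j (1-based) with max_{1<=w<=j} sum_{i=j-w+1}^{j}
    |e_i|(e_i - |e_i|/2) >= c_limit, or None (Han and Tsung 2006)."""
    m = 0.0
    for j, e in enumerate(eps, start=1):
        s = abs(e) * (e - abs(e) / 2.0)
        m = s + max(m, 0.0)          # running max of suffix sums
        if m >= c_limit:
            return j
    return None
```

The recursion in rfcuscore avoids recomputing all w suffix sums at each time j, so the statistic is updated in constant time per observation.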
The supplementary file contains some results in cases when observations at different time
points are correlated and their error terms follow the AR(1) model (11). These results show that
the CUSUM chart (14)-(15), which is based on the standardized error terms, is indeed effective
in detecting mean shifts in the original process observations. In cases when µ(t), σ²_y(t), and φ are unknown and need to be estimated from an IC dataset, it appears that more IC data are needed to keep the actual ATS0 values of the chart (14)-(15) within 10% of the nominal ATS0 value, compared to the results in Figure 2, where within-subject observations are independent. Finally,
in the supplementary file, we also consider an example in which the within-subject observations
are correlated, but the correlation does not follow the AR(1) model (11). The block bootstrap
procedure discussed in Section 2.4 is used in designing our proposed DySS method. From the
results in this example, it can be seen that the DySS method still performs well in such cases.
4 Application to the Total Cholesterol Level Example
In this section, we apply our proposed DySS method to the total cholesterol level example that
is briefly described in Section 1. This example is a part of the SHARe Framingham Heart Study
of the National Heart, Lung and Blood Institute (cf., Cupples et al. 2007). In the data, there are 1028 non-stroke patients (i.e., m = 1028) and 27 stroke patients. In the study, each patient was
followed 7 times (i.e., J = 7), and the total cholesterol level (in mg/100ml) was recorded at each
time. Because the total cholesterol level, denoted as y, is an important risk factor of stroke, it is
important to detect its irregular temporal pattern so that medical interventions can be applied in a
timely manner and stroke can be avoided. Here, we demonstrate that our proposed DySS method
can be used for identifying irregular total cholesterol level patterns in this example.
First, the observed data of the 1028 non-stroke patients are used as the IC data, from which we can estimate the IC mean function µ(t) and the IC variance function σ²_y(t). To this end, the covariance function V(s, t) is first estimated by (5), and then µ(t) and σ²_y(t) are estimated by µ̂(t; Ṽ) and σ̂²_y(t), respectively, as described in Section 2.1. The 95% pointwise confidence band of µ(t), defined to be µ̂(t; Ṽ) ± 1.96 σ̂_y(t), and the estimate µ̂(t; Ṽ) are presented in Figure 3, together with the observed total cholesterol levels of the 27 stroke patients. From the figure, it can be seen that only 6 of the 27 stroke patients are detected to have irregular total cholesterol level patterns by this confidence interval approach, because their observed total cholesterol levels exceed the upper confidence levels
at least once. From the figure, it can also be seen that there are no stroke patients whose observed total cholesterol levels fall below the lower confidence levels.
After µ(t) and σ²_y(t) are estimated, we can compute the standardized residuals by

ǫ̂(t_ij) = (y(t_ij) − µ̂(t_ij; Ṽ)) / σ̂_y(t_ij),  for i = 1, ..., m, j = 1, ..., J.
To address the possible autocorrelation among {ǫ̂(tij)}, we consider the AR(1) model (11) (see also
(12)), and the coefficient φ in the model is estimated by the LS procedure described in Section 2.3
to be φ̂ = 0.9217. The goodness-of-fit of the AR(1) model is studied as follows. First, the predicted
values of the model (cf., (12))
ǫ̃(tij) = φ̂∆i,j−1 ǫ̂(ti,j−1)
are ordered from the smallest to the largest and then divided into g groups of the same size. Namely,
the kth group includes all predicted values in the interval Ik = [qk−1, qk), for k = 1, 2, . . . , g, where
qk is the (k/g)th quantile of the predicted values {ǫ̃(tij)}, for k = 1, 2, . . . , g − 1, q0 = −∞, and
qg = ∞. Let Ok be the proportion of all standardized residuals {ǫ̂(tij)} in the interval Ik, for
k = 1, 2, . . . , g. Then, the following Pearson’s χ2 statistic:
X2 =
g∑
k=1
(Ok − 1/g)2
1/g
Figure 3: The 95% pointwise confidence band of the mean total cholesterol level µ(t) (dashed curves), the point estimator µ̂(t; Ṽ) of µ(t) (dark solid curve), and the observed total cholesterol levels of the 27 stroke patients (thin curves).
measures the discrepancy between the distribution of the standardized residuals and the distribution of their predicted values under the AR(1) model. If the AR(1) model fits the data well, then the distribution of X² should be approximately χ²_{g−2}, where the degrees of freedom are g − 2 because one parameter (i.e., φ) in the AR(1) model needs to be estimated beforehand. If we choose g = 12, then the observed value of X² is 0.2265, giving a p-value of about 1.0. We also tried other g values in the interval [5, 50], and similarly large p-values were obtained. Therefore, the AR(1) model seems to fit the standardized residuals well.
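The grouping-and-χ² computation above can be sketched as follows (the sample-quantile convention and all names are ours; counts are converted to proportions O_k as in the text):

```python
# Pearson-type goodness-of-fit statistic for the fitted AR(1) model:
# X^2 = sum_k (O_k - 1/g)^2 / (1/g), with O_k the proportion of standardized
# residuals in the k-th quantile interval of the predicted values.

def pearson_gof(eps_hat, eps_tilde, g=12):
    n = len(eps_tilde)
    s = sorted(eps_tilde)
    # Group boundaries: q_0 = -inf, q_k = (k/g)-th sample quantile, q_g = +inf.
    qs = [float("-inf")] + [s[int(k * n / g) - 1] for k in range(1, g)] + [float("inf")]
    x2 = 0.0
    for k in range(g):
        o_k = sum(qs[k] <= e < qs[k + 1] for e in eps_hat) / len(eps_hat)
        x2 += (o_k - 1.0 / g) ** 2 / (1.0 / g)
    return x2
```

Under a good fit, the residuals spread roughly evenly over the g quantile intervals and X² stays small.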
After µ(t), σ²_y(t), and φ are all estimated, we next design a CUSUM chart to detect irregular longitudinal patterns in the total cholesterol levels of the 27 stroke patients. In this example, because we are mainly concerned about upward shifts in patients' total cholesterol levels, only the upward CUSUM chart (14)-(15) is considered here. Since each stroke patient has an average of 2.3 observations every 10 years (cf., Figure 3), we choose d = 2. In the CUSUM chart, we choose k = 0.1 and ATS0 = 25. In such cases, the control limit ρ is computed to be 0.927. The CUSUM charts for monitoring the 27 stroke patients are presented in Figure 4, from which we can see that 22 of the 27 stroke patients are detected to have upward mean shifts. The 22 signal
times, computed from the starting point of each process monitoring, are listed in Table 2. The
average signal time is 13.818 years. Compared to the confidence interval approach described at
the beginning of this section, it can be seen that our DySS approach is more effective in detecting
irregular longitudinal patterns of the total cholesterol levels.
Table 2: Signal times (years) of the 22 stroke patients by our proposed CUSUM chart (14)-(15).

Patient ID   Signal time     Patient ID   Signal time
2            12              16           19
3            13              17           12
4            12              18           8
5            22              19           12
6            11              20           16
7            8               21           8
9            23              22           9
11           24              23           11
12           8               24           16
13           7               25           26
15           8               27           19
5 Concluding Remarks
We have described the DySS method for dynamically identifying individuals with irregular longitudinal patterns. This approach tries to make use of all available information when monitoring the longitudinal pattern of an individual, including the information from an IC population of well-functioning individuals and all historical data of the individual being monitored,
using both longitudinal data analysis and SPC techniques. Our numerical examples have shown
that it is effective in various cases. Although this paper focuses on the monitoring of the mean
function of the longitudinal response, similar DySS screening systems can be developed for monitoring the variance function and other features of the response that may affect the longitudinal
performance of the individuals in a specific application. This approach can also be generalized to
cases when multiple responses are present.
In this paper, we have discussed cases when the observation times are regularly spaced, when the observation times are irregularly spaced, and when the observations at different time points are correlated. The cases when observation times are irregularly spaced, the observations at different time points are correlated, and the correlation needs to be described by a parametric time series model are especially challenging, because there is little existing discussion in the literature about time series modeling in such cases. In the paper, only the case when the AR(1) model (11) is appropriate is discussed. More future research is required to handle cases with more flexible parametric time series models.

Figure 4: CUSUM charts (14)-(15) with (k, ρ, ATS0) = (0.1, 0.927, 25) for monitoring the 27 stroke patients. The dashed horizontal lines denote the control limit ρ.
Supplementary Materials
supplemental.pdf: This pdf file discusses some statistical properties of the estimator µ̂(t; Ṽ )
defined in Section 2.1, some technical details about the adjustment procedure (15) in Li
(2011) and the CV procedure discussed in Section 2.1, and some extra numerical results.
ComputerCodesAndData.zip: This zip file contains MATLAB source codes of our proposed
DySS method and the stroke data used in the paper.
25
-
Acknowledgments: We thank the editor, the associate editor, and two referees for many constructive comments and suggestions, which greatly improved the quality of the paper. This research is supported in part by an NSF grant and two grants from the Natural Science Foundation of China.
References
Apley, D.W., and Tsung, F. (2002), "The autoregressive T² chart for monitoring univariate autocorrelated processes," Journal of Quality Technology, 34, 80–96.
Brockwell, P., Davis, R., and Yang, Y. (2007), "Continuous-time Gaussian autoregression," Statistica Sinica, 17, 63–80.
Capizzi, G., and Masarotto, G. (2003), “An adaptive exponentially weighted moving average
control chart,” Technometrics, 45, 199–207.
Chen, K., and Jin, Z. (2005), “Local polynomial regression analysis of clustered data,” Biometrika,
92, 59–74.
Cupples, L.A., et al. (2007), "The Framingham Heart Study 100K SNP genome-wide association study resource: overview of 17 phenotype working group reports," BMC Medical Genetics, 8 (Suppl 1): S1.
Ding, Y., Zeng, L., and Zhou, S. (2006), “Phase I analysis for monitoring nonlinear profiles in
manufacturing processes,” Journal of Quality Technology, 38, 199–216.
Gan, F.F. (1995), “Joint monitoring of process mean and variance using exponentially weighted
moving average control charts,” Technometrics, 37, 446–453.
Han, D., and Tsung, F. (2006), “A reference-free cuscore chart for dynamic mean change detection
and a unified framework for charting performance comparison,” Journal of the American
Statistical Association, 101, 368–386.
Hawkins, D.M., and Deng, Q. (2010), “A nonparametric change-point control chart,” Journal of
Quality Technology, 42, 165–173.
26
-
Hawkins, D.M., and Olwell, D.H. (1998), Cumulative Sum Charts and Charting for Quality Improvement, New York: Springer-Verlag.
Jensen, W.A., and Birch, J.B. (2009), “Profile monitoring via nonlinear mixed models,” Journal
of Quality Technology, 41, 18–34.
Jiang, W. (2004), “Multivariate control charts for monitoring autocorrelated processes,” Journal
of Quality Technology, 36, 367–379.
Jiang, W., Tsui, K.L., and Woodall, W. (2000), “A new SPC monitoring method: the ARMA
chart,” Technometrics, 42, 399–410.
Knoth, S. (2011), "spc: Statistical Process Control," R package version 0.4.0, http://CRAN.R-project.org/package=spc.
Lahiri, S.N. (2003), Resampling Methods for Dependent Data, New York: Springer.
Li, Y. (2011), "Efficient semiparametric regression for longitudinal data with nonparametric covariance estimation," Biometrika, 98, 355–370.
Lu, C.W., and Reynolds, M.R., Jr. (2001), “Control charts for monitoring an autocorrelated
process,” Journal of Quality Technology, 33, 316–334.
Ma, S., Yang, L., and Carroll, R. (2012), “A simultaneous confidence band for sparse longitudinal
regression,” Statistica Sinica, 22, 95–122.
Maller, R.A., Müller, G., and Szimayer, A. (2008), “GARCH modelling in continuous time for
irregularly spaced time series data,” Bernoulli, 14, 519–542.
Montgomery, D. C. (2009), Introduction To Statistical Quality Control (6th edition), New York:
John Wiley & Sons.
Moustakides, G.V. (1986), “Optimal stopping times for detecting changes in distributions,” The
Annals of Statistics, 14, 1379–1387.
Pan, J., Ye, H., and Li, R. (2009), "Nonparametric regression of covariance structures in longitudinal studies," Technical Report 2009.46, School of Mathematics, The University of Manchester, UK.
Qiu, P. (2013), Introduction to Statistical Process Control, London: Chapman & Hall/CRC.
27
-
Qiu, P. (2008), “Distribution-free multivariate process control based on log-linear modeling,” IIE
Transactions, 40, 664–677.
Qiu, P., and Hawkins, D. (2001), “A rank based multivariate CUSUM procedure,” Technometrics,
43, 120–132.
Qiu, P., and Li, Z. (2011a), “On nonparametric statistical process control of univariate processes,”
Technometrics, 53, 390–405.
Qiu, P., and Li, Z. (2011b), “Distribution-free monitoring of univariate processes,” Statistics and
Probability Letters, 81, 1833–1840.
Qiu, P., Zou, C., and Wang, Z. (2010), "Nonparametric profile monitoring by mixed effects modeling (with discussions)," Technometrics, 52, 265–277.
Reynolds, M.R., Jr., Amin, R.W., and Arnold, J.C. (1990), "CUSUM charts with variable sampling intervals," Technometrics, 32, 371–384.
Shu, L., and Jiang, W. (2006), “A Markov chain model for the adaptive cusum control chart,”
Journal of Quality Technology, 38, 135–147.
Sparks, R.S. (2000), “CUSUM charts for signalling varying location shifts,” Journal of Quality
Technology, 32, 157–171.
Vityazev, V.V. (1996), “Time series analysis of unequally spaced data: the statistical properties
of the Schuster periodogram,” Astronomical and Astrophysical Transactions, 11, 159–173.
Yao, F., Müller, H.G., and Wang, J.L. (2005), “Functional data analysis for sparse longitudinal
data,” Journal of the American Statistical Association, 100, 577–590.
Yeh, A.B., Lin, D.K.J., and Venkataramani, C. (2004), “Unified CUSUM charts for monitoring
process mean and variability,” Quality Technology and Quantitative Management, 1, 65–86.
Zhao, Z., and Wu, W. (2008), “Confidence bands in nonparametric time series regression,” Annals
of Statistics, 36, 1854–1878.
Zou, C., and Tsung, F. (2010), “Likelihood ratio-based distribution-free EWMA control charts,”
Journal of Quality Technology, 42, 1–23.