Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior

Peihua Qiu¹ and Dongdong Xiang²

¹School of Statistics, University of Minnesota
²School of Finance and Statistics, East China Normal University

Abstract

In our daily life, we often need to identify individuals whose longitudinal behavior is different from that of well-functioning individuals, so that unpleasant consequences can be avoided. In many such applications, observations of a given individual are obtained sequentially, and it is desirable to have a screening system that gives a signal of irregular behavior as soon as possible after that individual's longitudinal behavior starts to deviate from the regular behavior, so that adjustments or interventions can be made in a timely manner. This paper proposes a dynamic screening system for that purpose in cases when the longitudinal behavior is univariate, using statistical process control and longitudinal data analysis techniques. Several different cases, including those with regularly spaced observation times, irregularly spaced observation times, and correlated observations, are discussed. Our proposed method is demonstrated using a real-data example from the SHARe Framingham Heart Study of the National Heart, Lung and Blood Institute. This paper has supplementary materials online.

Key Words: Correlation; Dynamic screening; Longitudinal data; Process monitoring; Process screening; Standardization; Statistical process control; Unequal sampling intervals.

    1 Introduction

Most products (e.g., airplanes, cars, batteries of notebook computers) need to be checked regularly or occasionally with respect to certain variables related to their quality and/or performance. If the observed values of these variables for a given product are significantly worse than the values of a typical well-functioning product of the same age, then proper adjustments or interventions should be made to avoid unpleasant consequences. This paper suggests a dynamic screening system for identifying such improperly functioning products effectively.

The motivating example of this research is the SHARe Framingham Heart Study of the National Heart, Lung and Blood Institute. In the study, 1055 patients were recruited, each patient was followed 7 times, and the total cholesterol level of each patient was recorded during each of the 7 clinic visits. Among the 1055 patients, 1028 did not have any strokes during the study, and the remaining 27 patients had at least one stroke. One major research goal of the study is to identify patients with irregular total cholesterol level patterns as early as possible, so that medical treatments can be given in a timely manner to avoid the occurrence of stroke. To this end, one traditional method is to construct pointwise confidence intervals for the mean total cholesterol level of the non-stroke patient population across different ages, using the longitudinal data of all non-stroke patients. Then, a patient can be identified as having an irregular total cholesterol level pattern if his/her total cholesterol levels go beyond the confidence intervals. In the literature, there have been some discussions about the construction of such confidence intervals. See, for instance, Ma et al. (2012), Yao et al. (2005), and Zhao and Wu (2008).

The confidence interval methods mentioned above use a cross-sectional comparison approach. In the context of the total cholesterol level example, they compare patients' total cholesterol levels at the cross-section of a given age. One limitation of this approach is that it does not make use of the medical history of a given patient in the medical diagnostics. If the total cholesterol levels at different ages of a given patient are regarded as observations of a random process, then intuitively we should make use of all such observations of the process for medical diagnostics. If a patient's total cholesterol level is consistently above the mean total cholesterol level of the non-stroke patient population for a long enough time period, then that patient should still be identified as a person who has an irregular total cholesterol level pattern, even if his/her total cholesterol level at any given time point is within the confidence interval of the mean total cholesterol level of the non-stroke patient population. Further, it is critical that the diagnostic method be dynamic, in the sense that a decision about a patient can be made at any time during the observation process, as long as all observations up to the current time point have provided enough evidence for the decision. That is because the major purpose of monitoring a patient's total cholesterol level is to find any irregular pattern as soon as possible, so that necessary medical treatments can be given in a timely manner. The confidence interval methods mentioned in the previous paragraph usually do not have such a dynamic decision-making property, because they do not follow a patient sequentially over time. In the literature, some statistical process control (SPC) methodologies, including the cumulative sum (CUSUM) and the exponentially weighted moving average (EWMA) charts, are designed for such purposes (e.g., Hawkins and Olwell 1998, Montgomery 2009, Qiu 2013). These control charts evaluate the performance of a process sequentially based on all observed data up to the current time point. However, they cannot be applied to the current problem directly, for the following reasons. First, most SPC methods are for monitoring a single process (e.g., a production line, a medical system). When the process performs satisfactorily, its observations follow a so-called in-control (IC) distribution and the process is said to be IC. The major purpose of SPC control charts is to detect any distributional shift in the process observations as soon as possible, so that the related process can be stopped promptly for adjustments and repairs. In the current problem, if each patient is regarded as a process, then there are many processes involved. To judge whether a patient's total cholesterol level is IC, we need to take the total cholesterol levels of all non-stroke patients into consideration. A conventional SPC control chart is not designed for that purpose. Second, in a typical SPC problem, if we are interested in monitoring the process mean, then the process mean is assumed to be a constant when the process is IC. In the current problem, however, the mean total cholesterol level of the non-stroke patients changes over time. Because of these fundamental differences between the conventional SPC problem and the current problem, the existing SPC methodologies cannot be applied directly to the current problem.

In this paper, we focus on cases when the variable to monitor (e.g., the total cholesterol level) is univariate and we are interested in monitoring its mean. In such cases, we propose a dynamic screening system (DySS) to dynamically identify individuals with irregular longitudinal patterns. In the DySS method, the regular longitudinal pattern is first estimated from the observed longitudinal data of a group of well-functioning individuals (e.g., the non-stroke patients in the stroke example). Then, the estimated regular longitudinal pattern is used for standardizing the observations of an individual to monitor. Finally, an SPC control chart is applied to the standardized observations of that individual for online monitoring of its longitudinal behavior.

It should be pointed out that the DySS problem described above looks like a profile monitoring problem in SPC. But the two problems are actually completely different. In the profile monitoring problem, observations obtained from a sampled product form a profile that describes the relationship between a response variable and one or more explanatory variables, the profiles from different sampled products are usually ordered by the observation times or spatial locations of the sampled products, and the major goal is to detect any shift in the mean and/or variance of the profiles over time or location (cf., e.g., Ding et al. 2006, Jensen and Birch 2009, Qiu et al. 2010). As a comparison, in the DySS problem, although the observations of each individual look like a profile, each individual is actually treated as a separate process, and we are mainly interested in monitoring the longitudinal behavior of each individual. Different individuals are often not ordered by time or location.

The remainder of the article is organized as follows. Our proposed DySS methodology is described in detail in Section 2. A simulation study evaluating its numerical performance is presented in Section 3. The proposed method is demonstrated using the motivating example about patients' total cholesterol levels in Section 4. Some concluding remarks are given in Section 5. Theoretical properties of the estimated IC mean function of the longitudinal response are discussed in a supplementary file, along with certain technical details and numerical results.

    2 The DySS Method

Our proposed DySS method is described in four parts. In the first part, estimation of the regular longitudinal pattern from the observed longitudinal data of a group of well-functioning individuals is discussed. In the second part, an SPC chart is proposed for monitoring individuals dynamically, after their observations are standardized by the estimated regular longitudinal pattern; in this part, longitudinal observations are assumed to be independent and normally distributed. Situations when they are normally distributed and autocorrelated, with the autocorrelation described by an AR(1) time series model, are discussed in the third part. Finally, general cases when observations are autocorrelated with an arbitrary autocorrelation and when their distribution is non-normal are discussed in the fourth part.

    2.1 Estimation of the regular longitudinal pattern

Assume that y is the variable that we are interested in monitoring, and that observations of y for a group of m well-functioning individuals follow the model

y(t_ij) = µ(t_ij) + ε(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J,    (1)

where t_ij is the jth observation time of the ith individual, y(t_ij) is the observed value of y at t_ij, µ(t_ij) is the mean of y(t_ij), and ε(t_ij) is the error term. For simplicity, we assume that the design space is [0, 1]. As in the literature of longitudinal data analysis (e.g., Li 2011), we further assume that the error term ε(t), for any t ∈ [0, 1], consists of two independent components, i.e., ε(t) = ε₀(t) + ε₁(t), where ε₀(·) is a random process with mean 0 and covariance function V₀(s, t), for any s, t ∈ [0, 1], and ε₁(·) is a noise process such that ε₁(s) and ε₁(t) are independent for any s ≠ t in [0, 1]. In this decomposition, ε₁(t) denotes the pure measurement error, and ε₀(t) accounts for all possible covariates that may affect y but are not included in model (1). In such cases, the covariance function of ε(·) is

V(s, t) = Cov(ε(s), ε(t)) = V₀(s, t) + σ₁²(s) I(s = t), for any s, t ∈ [0, 1],    (2)

where σ₁²(s) = Var(ε₁(s)), and I(s = t) = 1 when s = t and 0 otherwise. In model (1), observations of different individuals are assumed to be independent. As a side note, model (1) is similar to the nonparametric mixed-effects model discussed in Qiu et al. (2010). Their major difference is that the variance of the pure measurement error ε₁(t) in (1) is allowed to vary over t, while that variance is assumed to be a constant in Qiu et al. (2010). Also, J is assumed to be a constant in (1) for simplicity. The method discussed in this paper can be generalized to cases when J depends on i without much difficulty.

For the longitudinal model specified by (1) and (2), Chen et al. (2005) and Pan et al. (2009) propose the following estimator of µ(t), based on the local pth-order polynomial kernel smoothing procedure:

µ̂(t; V) = e₁′ ( Σ_{i=1}^m X_i′ V_i⁻¹ X_i )⁻¹ ( Σ_{i=1}^m X_i′ V_i⁻¹ y_i ),    (3)

where e₁ is the (p+1)-dimensional vector with 1 in its first component and 0 everywhere else, X_i is the J × (p+1) matrix whose jth row is (1, (t_ij − t), . . . , (t_ij − t)^p), V_i⁻¹ = K_ih^{1/2}(t) (I_i Σ_i I_i)⁻¹ K_ih^{1/2}(t), K_ih(t) = diag{ K((t_i1 − t)/h), . . . , K((t_iJ − t)/h) } / h, K is a density kernel function, h > 0 is a bandwidth, I_i = diag{ I(|t_i1 − t| ≤ 1), . . . , I(|t_iJ − t| ≤ 1) }, and Σ_i is the covariance matrix of y_i = (y(t_i1), y(t_i2), . . . , y(t_iJ))′, which can be computed from V(s, t) in (2). Throughout this paper, the inverse of a matrix refers to the Moore-Penrose generalized inverse, which always exists.
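To make the estimator (3) concrete, below is a minimal Python sketch of its working-independence special case (all Σ_i set to the identity), which is exactly the initial estimator µ̃(t) introduced in the next paragraph. The function names and the use of NumPy are our own illustration, not code from the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov density kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def mu_hat(t, times, ys, h, p=1):
    """Local p-th order polynomial kernel estimator of mu(t) under working
    independence (all Sigma_i equal to the identity in (3)).

    times, ys : (m, J) arrays of observation times t_ij and values y(t_ij).
    """
    d = times.ravel() - t                        # t_ij - t for all i, j
    w = epanechnikov(d / h) / h                  # kernel weights K_h(t_ij - t)
    X = np.vander(d, N=p + 1, increasing=True)   # rows (1, (t_ij-t), ..., (t_ij-t)^p)
    A = X.T @ (w[:, None] * X)                   # X' W X
    b = X.T @ (w * ys.ravel())                   # X' W y
    beta = np.linalg.pinv(A) @ b                 # Moore-Penrose inverse, as in the paper
    return beta[0]                               # e_1' beta = mu-hat(t)
```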

In practice, the error covariance function V(s, t) is usually unknown, and it needs to be estimated from the observed data. To this end, we propose a method consisting of several steps, described below. First, we compute an initial estimator of µ(t), denoted as µ̃(t), by (3) with all Σ_i replaced by the identity matrix. Obviously, µ̃(t) is just the local pth-order polynomial kernel estimator based on the assumption that the error terms in (1) at different time points are i.i.d. for each individual subject. Second, we define the residuals

ε̃(t_ij) = y(t_ij) − µ̃(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

Third, when s ≠ t, we use the method originally proposed by Li (2011) to estimate V₀(s, t) in (2) by

Ṽ₀(s, t) = ( A₁(s, t)V₀₀(s, t) − A₂(s, t)V₁₀(s, t) − A₃(s, t)V₀₁(s, t) ) B⁻¹(s, t),    (4)

where A₁(s, t) = S₂₀(s, t)S₀₂(s, t) − S₁₁²(s, t), A₂(s, t) = S₁₀(s, t)S₀₂(s, t) − S₀₁(s, t)S₁₁(s, t), A₃(s, t) = S₀₁(s, t)S₂₀(s, t) − S₁₀(s, t)S₁₁(s, t), B(s, t) = A₁(s, t)S₀₀(s, t) − A₂(s, t)S₁₀(s, t) − A₃(s, t)S₀₁(s, t),

S_{l₁l₂}(s, t) = [1/(mJ(J−1))] Σ_{i=1}^m Σ_{j≠j′} ((t_ij − s)/h)^{l₁} ((t_ij′ − t)/h)^{l₂} K_h(t_ij − s) K_h(t_ij′ − t),

V_{l₁l₂}(s, t) = [1/(mJ(J−1))] Σ_{i=1}^m Σ_{j≠j′} ε̃(t_ij) ε̃(t_ij′) ((t_ij − s)/h)^{l₁} ((t_ij′ − t)/h)^{l₂} K_h(t_ij − s) K_h(t_ij′ − t),

K_h(t_ij − s) = K((t_ij − s)/h)/h, and l₁, l₂ = 0, 1, 2. Note that Ṽ₀(s, t) in (4) may not be positive semidefinite. To address this issue, the adjustment procedure (15) in Li (2011), which is described in the supplementary file, can effectively regularize it into a well-defined covariance function. Then, the variance of y(t), denoted as σ²_y(t) = V₀(t, t) + σ₁²(t), can be regarded as the mean function of ε²(t), and it can be estimated from the new data {ε̃²(t_ij), i = 1, 2, . . . , m, j = 1, 2, . . . , J} by (3), in which y_i is replaced by (ε̃²(t_i1), . . . , ε̃²(t_iJ))′ and Σ_i is replaced by the identity matrix. The resulting estimator is denoted as σ̃²_y(t). Finally, we define the estimator of V(s, t) by

Ṽ(s, t) = Ṽ₀(s, t) I(s ≠ t) + σ̃²_y(t) I(s = t).    (5)

Consequently, the mean function µ(t) can be estimated by µ̂(t; Ṽ).

Note that, in the four steps described above for computing µ̃(t), Ṽ₀(s, t), σ̃²_y(t), and µ̂(t; Ṽ), the bandwidth h could be chosen differently in each step. We did not distinguish them in the above description for simplicity. In all our numerical examples presented later, they are allowed to be different, and each of them is chosen separately and sequentially by the conventional cross-validation (CV) procedure (cf. the description in the supplementary file) as follows. First, the bandwidth for µ̃(t) is selected. Then, the bandwidths for Ṽ₀(s, t) and σ̃²_y(t) are selected, respectively. Finally, the bandwidth for µ̂(t; Ṽ) is selected. Some theoretical properties of µ̂(t; Ṽ) are given in a proposition in the supplementary file, which shows that µ̂(t; Ṽ) is statistically consistent under some regularity conditions.
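The CV procedure itself is described in the supplementary file and is not reproduced here. Purely as an illustration of the idea, the following sketch scores candidate bandwidths for µ̃(t) by leave-one-individual-out prediction error, reusing mu_hat from the sketch above; the leave-one-individual-out variant is our assumption, not necessarily the paper's exact procedure.

```python
import numpy as np

def cv_bandwidth(times, ys, candidates):
    """Illustrative leave-one-individual-out CV for the bandwidth of mu-tilde."""
    m = times.shape[0]
    best_h, best_score = None, np.inf
    for h in candidates:
        score = 0.0
        for i in range(m):
            keep = np.arange(m) != i                  # hold out individual i
            for t, y in zip(times[i], ys[i]):
                score += (y - mu_hat(t, times[keep], ys[keep], h)) ** 2
        if score < best_score:
            best_h, best_score = h, score
    return best_h
```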

By the estimation procedure described above, we can compute the local polynomial kernel estimator µ̂(t; Ṽ) of the mean function µ(t) in cases when the error covariance function V(s, t) is unknown. After µ(t) is estimated, the variance function σ²_y(t) can be estimated in the same way, except that the original observations {y(t_ij)} are replaced by

ε̂²(t_ij) = (y(t_ij) − µ̂(t_ij; Ṽ))², for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

The resulting estimator of σ²_y(t) is denoted as σ̂²_y(t).
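A matching sketch of this variance-estimation step, reusing mu_hat as the smoother; note that it uses the working-independence µ̃(t) in place of µ̂(t; Ṽ), so it is a simplification of the procedure in the text.

```python
import numpy as np

def sigma2_hat(t, times, ys, h_mu, h_var):
    """Estimate sigma_y^2(t) by kernel-smoothing squared residuals."""
    resid2 = np.array([[(y - mu_hat(tt, times, ys, h_mu)) ** 2
                        for tt, y in zip(trow, yrow)]
                       for trow, yrow in zip(times, ys)])
    return mu_hat(t, times, resid2, h_var)    # smooth the squared residuals
```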

The models (1) and (2) are quite general, and they include a variety of different situations as special cases. In certain applications, the variance structure of the observed data can be described by a simpler model. For instance, in the SPC literature, we often consider cases when the error terms at different time points are independent of each other or follow a specific time series model. In the current setup, the first scenario corresponds to the case when V₀(s, t) = 0 for any s, t ∈ [0, 1]. In such cases, µ(t) and σ²_y(t) can be estimated in the same way as described above, except that we do not need to estimate V₀(s, t).

    2.2 SPC charts for dynamically identifying irregular individuals

In the previous part, we defined the estimators µ̂(t; Ṽ) and σ̂²_y(t) of the IC mean function µ(t) and the IC variance function σ²_y(t), based on the observed longitudinal data of m well-functioning individuals, which are sometimes called IC data for simplicity. In this part, we propose SPC charts for sequentially monitoring the longitudinal behavior of a new individual, using the estimated regular pattern of the longitudinal response variable y(t) that is described by µ̂(t; Ṽ) and σ̂²_y(t).

Assume that the new individual's y values are observed at times t*_1, t*_2, . . . over the design interval [0, 1]. In cases when that individual's longitudinal behavior is IC, these y values should also follow the model (1). For convenience of discussion, we re-write that model as

y(t*_j) = µ(t*_j) + σ_y(t*_j) ǫ(t*_j), for j = 1, 2, . . . ,    (6)

where σ_y(t)ǫ(t) here equals ε(t) in model (1). Thus, ǫ(t) in model (6) has mean 0 and variance 1 at each t. For the y values of the new individual, let us define their standardized values by

ǫ̂(t*_j) = ( y(t*_j) − µ̂(t*_j; Ṽ) ) / σ̂_y(t*_j), for j = 1, 2, . . .    (7)
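Combining the sketches above, the standardization (7) for a single new observation might look as follows, again with the µ̃-based estimators standing in for µ̂(t; Ṽ) and σ̂²_y(t):

```python
def standardize(t_star, y_star, times_ic, ys_ic, h_mu, h_var):
    """Standardized value eps-hat(t*_j) in (7) for one new observation,
    built on the working-independence sketches above."""
    mu = mu_hat(t_star, times_ic, ys_ic, h_mu)
    s2 = sigma2_hat(t_star, times_ic, ys_ic, h_mu, h_var)
    return (y_star - mu) / s2 ** 0.5
```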

In the case when {y(t*_j), j = 1, 2, . . .} are independent of each other and each of them has a normal distribution, {ǫ̂(t*_j), j = 1, 2, . . .} would be a sequence of asymptotically i.i.d. random variables with a common asymptotic distribution of N(0, 1). If we further assume that {t*_j, j = 1, 2, . . .} are equally spaced, then {ǫ̂(t*_j), j = 1, 2, . . .} can be monitored by a conventional control chart, such as a CUSUM or EWMA chart, as usual. For instance, to detect upward mean shifts in {ǫ̂(t*_j), j = 1, 2, . . .}, the charting statistic of the upward CUSUM chart can be defined by

C⁺_j = max( 0, C⁺_{j−1} + ǫ̂(t*_j) − k ), for j ≥ 1,    (8)

where C⁺₀ = 0 and k > 0 is the allowance constant. The chart gives a signal of an upward mean shift if

C⁺_j > ρ,    (9)

where ρ > 0 is a control limit. To evaluate the performance of the CUSUM chart (8)-(9), we can use the IC average run length (ARL), denoted as ARL₀, which is the average number of time points from the beginning of process monitoring to the signal time under the condition that the process is IC, and the out-of-control (OC) ARL, denoted as ARL₁, which is the average number of time points from the occurrence of a shift to the signal time under the condition that the process is OC. Usually, the allowance k is specified beforehand. It has been well demonstrated in the literature that large k values are good for detecting large shifts and small k values are good for detecting small shifts. If the shift size δ in the mean of ǫ̂ is known, then the optimal k is δ/2 (cf. Moustakides 1986). The control limit ρ is chosen such that a pre-specified ARL₀ value is reached. Then, one chart performs better than another for detecting a given shift if its ARL₁ value is smaller. For some commonly used k and ARL₀ values, the corresponding ρ values can be found in Table 3.2 of Hawkins and Olwell (1998). They can also be computed easily by software packages such as the R package spc (Knoth 2011).
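For illustration, here is a minimal Python sketch of the chart (8)-(9) applied to a stream of standardized observations; the function name and interface are our own.

```python
def upward_cusum(eps_hat, k, rho):
    """Upward CUSUM chart (8)-(9): returns the index of the first signal
    (0-based), or None if the chart never signals.

    eps_hat : sequence of standardized observations from (7).
    k, rho  : allowance constant and control limit.
    """
    c = 0.0                                  # C_0^+ = 0
    for j, e in enumerate(eps_hat):
        c = max(0.0, c + e - k)              # charting statistic (8)
        if c > rho:                          # decision rule (9)
            return j
    return None
```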

In practice, however, the observation times {t*_j, j = 1, 2, . . .} may not be equally spaced, as in the total cholesterol level example discussed in Section 1. In such cases, ARL₀ and ARL₁ are obviously inappropriate for measuring the performance of the CUSUM chart (8)-(9), and we need to define new performance measures. To this end, let ω > 0 be the basic time unit in a given application, defined as the largest time unit such that all unequally spaced observation times are integer multiples of it. Define

n*_j = t*_j / ω, for j = 0, 1, 2, . . . ,

where n*₀ = t*₀ = 0. Then t*_j = n*_j ω for all j, and n*_j is the jth observation time expressed in basic time units. In cases when the new individual is IC and the CUSUM chart (8)-(9) gives a signal at the sth observation time, n*_s is a random variable measuring the time to a false signal. Its mean, E(n*_s), measures the IC average time to signal (ATS), denoted as ATS₀. If the new individual has an upward mean shift starting at the τth observation time and the CUSUM chart (8)-(9) gives a signal at the sth time point with s ≥ τ, then the mean of n*_s − n*_τ is called the OC ATS, denoted as ATS₁. To measure the performance of a control chart (e.g., the chart (8)-(9)), its ATS₀ value can be fixed at a certain level beforehand; the chart then performs better if its ATS₁ value is smaller when detecting a shift of a given size. Obviously, the values of ATS₀ and ATS₁ are just constant multiples of the corresponding values of ARL₀ and ARL₁ in cases when the observation times are equally spaced; in such cases, the two sets of measures are equivalent. Also, when the observation times are unequally spaced, it is possible to use the IC average number of observations to signal (ANOS), denoted as ANOS₀, and the OC average number of observations to signal, denoted as ANOS₁, for measuring the performance of control charts. However, these quantities are also just constant multiples of ATS₀ and ATS₁ in cases when the sampling distribution does not change over time. Furthermore, when the sampling distribution changes over time, ANOS₀ and ANOS₁ may not measure the chart performance well. As a demonstration, consider two scenarios. In the first scenario, observations are collected at the basic time units 1, 3, 7, . . . , and a chart gives a signal at the third observation time. In the second scenario, observations are collected at the basic time units 1, 30, 100, . . . , and another chart also gives a signal at the third observation time. Assume that the processes are IC in both scenarios. Then the IC performance of the two charts is the same if ANOS₀ is used as the performance measure. But the second chart does not give a false signal until the 100th basic time unit, while the first chart gives a false signal at the 7th basic time unit. Obviously, ANOS₀ cannot reflect this quite dramatic difference. For all these reasons, only ATS₀ and ATS₁ are used in the remainder of the paper for measuring the performance of a control chart.

If the IC mean function µ(t) and the IC variance function σ²_y(t) are known, then the standardized observations of the new individual can be defined by

ǫ(t*_j) = ( y(t*_j) − µ(t*_j) ) / σ_y(t*_j), for j = 1, 2, . . .    (10)

In such cases, as long as the distribution of the observation times {t*_j, j = 1, 2, . . .} is specified properly, for a given k value in (8) and a given ATS₀ value, we can easily compute the corresponding value of the control limit ρ by simulation such that the pre-specified ATS₀ value is reached. For instance, when the distribution of the observation times is specified by the sampling rate d, defined here to be the number of observation times every 10 basic time units, the computed ρ values in cases when ATS₀ = 20, 50, 100, 150, 200, k = 0.1, 0.2, 0.5, 0.75, 1.0, and d = 2, 5, 10 are presented in Table 1. From the table, it can be seen that ρ increases with ATS₀ and d, and decreases with k. In cases when µ(t) and σ²_y(t) are unknown, they need to be estimated from an IC dataset by µ̂(t; Ṽ) and σ̂²_y(t), as discussed in Section 2.1. In such cases, the standardized observations of the new individual defined by (7) are only asymptotically N(0, 1) distributed when the new individual is IC. When the IC sample size is moderate to large, we can still use the control limit ρ computed in the case of (10), which will be justified by the numerical results described in the next section.

Table 1: Control limits of the CUSUM chart (8)-(9) for known IC mean and variance functions and various combinations of ATS₀, k, and d described in the text. For the entry marked with "∗", the actual ATS₀ value is more than 1% away from the assumed ATS₀ value. For all other entries, the actual ATS₀ values are within 1% of the assumed ATS₀ values.

 d    k      ATS₀=20   ATS₀=50   ATS₀=100   ATS₀=150   ATS₀=200
 2    0.10    0.929     1.844     2.820      3.552      4.171
      0.20    0.774     1.571     2.351      2.928      3.396
      0.50    0.382     0.986     1.493      1.822      2.088
      0.75    0.118     0.654     1.064      1.312      1.507
      1.00    0.001∗    0.355     0.718      0.936      1.101
 5    0.10    1.844     3.215     4.612      5.631      6.445
      0.20    1.571     2.664     3.713      4.381      5.005
      0.50    0.986     1.685     2.234      2.586      2.898
      0.75    0.654     1.200     1.612      1.869      2.072
      1.00    0.355     0.837     1.189      1.404      1.562
10    0.10    2.820     4.612     6.407      7.649      8.612
      0.20    2.351     3.713     4.937      5.744      6.408
      0.50    1.493     2.235     2.852      3.245      3.542
      0.75    1.061     1.610     2.039      2.310      2.509
      1.00    0.718     1.192     1.538      1.738      1.894
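To illustrate how such control limits can be obtained, the following sketch pairs a Monte Carlo estimate of ATS₀ (assuming i.i.d. N(0, 1) standardized observations arriving every 10/d basic time units) with a bisection search over ρ; the replication counts and search bounds are arbitrary illustrative choices, not the paper's algorithm.

```python
import numpy as np

def simulate_ats0(rho, k, d, n_reps=2000, horizon=10000, rng=None):
    """Monte Carlo estimate of ATS0 (in basic time units) of chart (8)-(9)
    for i.i.d. N(0, 1) observations arriving every 10/d basic time units."""
    rng = rng if rng is not None else np.random.default_rng(0)
    gap = 10.0 / d                          # basic time units between observations
    total = 0.0
    for _ in range(n_reps):
        c, j = 0.0, 0
        while True:
            j += 1
            c = max(0.0, c + rng.standard_normal() - k)
            if c > rho or j * gap >= horizon:
                break
        total += j * gap
    return total / n_reps

def calibrate_rho(k, d, ats0_target, lo=0.0, hi=10.0, tol=0.005):
    """Bisection search for the control limit rho reaching ats0_target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulate_ats0(mid, k, d) < ats0_target:
            lo = mid                        # chart signals too early: raise rho
        else:
            hi = mid
    return 0.5 * (lo + hi)
```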

In the SPC literature, the concept of ATS has been proposed for measuring the performance of a control chart with variable sampling intervals (VSIs). See, for instance, Reynolds et al. (1990). However, the VSI problem in SPC is completely different from the DySS problem discussed here. In the VSI problem, the next observation time is determined by the current and all past observations of the process; the next observation is collected sooner if there is more evidence of a shift based on all available process observations, and later otherwise. Therefore, the VSI scheme is an integrated part of process monitoring. As a comparison, in the DySS problem, observation times are often not specifically chosen to improve process monitoring. In the total cholesterol level example discussed in Section 1, for instance, patients often choose their clinic visit times based on their convenience or health conditions.

At the end of this subsection, we would like to make several remarks about the DySS problem and the CUSUM chart (8)-(9). First, the above description focuses only on detecting upward shifts in the process mean function µ(t). Control charts for detecting downward mean shifts or arbitrary mean shifts can be developed in the same way, except that the upward charting statistic in (8) and the decision rule in (9) should be changed to their downward or two-sided versions (cf. Qiu 2013, Chapter 4). Second, control charts for detecting shifts in the process variance function σ²_y(t), or shifts in both µ(t) and σ²_y(t), can be developed in a similar way, after (8) and (9) are replaced by appropriate charting statistics and related decision rules. See, for instance, Gan (1995) and Yeh et al. (2004) for discussions of the corresponding problems in the conventional SPC setup. Third, although a CUSUM chart is discussed above, EWMA charts and charts based on change-point detection (CPD) can also be used in the DySS problem. Generally speaking, the CUSUM and EWMA charts have similar performance, and the CPD charts can provide estimates of the shift position at the time when they give signals, although their computation is usually more extensive. See Qiu (2013) for a related discussion. Fourth, to use the proposed DySS method, the observation times {t*_j} of the new individual to monitor usually cannot be larger than the largest observation time {t_ij} in the IC data, to avoid extrapolation in the data standardization in (7). In other words, it is inappropriate to use the DySS method to monitor the new individual outside the time range [min_{i,j} t_ij, max_{i,j} t_ij]. From the description of the DySS method above, it can be seen that this methodology actually combines the cross-sectional comparison between the new individual to monitor and the m well-functioning individuals in the IC data with the sequential monitoring of the new individual in a dynamic manner. Therefore, to use the method to monitor the longitudinal behavior of the new individual at time t, there should exist observations around t in the IC data to make the cross-sectional comparison possible. Fifth, although the simplest CUSUM chart (8)-(9) is used in this paper to demonstrate the DySS method, some existing research in the SPC literature to accommodate correlated data (e.g., Apley and Tsung 2002, Jiang 2004) or to detect dynamic mean shifts (e.g., Han and Tsung 2006, Shu and Jiang 2006) might also be considered here. In the numerical study presented in Section 3, the control charts of Han and Tsung (2006) and Shu and Jiang (2006) will be discussed.

2.3 Cases when the longitudinal observations follow an AR(1) model

In the above discussion, we assumed that the longitudinal observations {y(t*_j), j = 1, 2, . . .} are independent. In practice, they are often correlated. In the SPC literature, time series models, especially the AR(1) model, are commonly used for describing the autocorrelation in such cases (cf. Jiang et al. 2000, Lu and Reynolds 2001). In this section, we discuss how to use the proposed DySS procedure in cases when the longitudinal observations follow an AR(1) model. In a conventional time series model, observation times are usually equally spaced. In the current DySS problem, however, they can be unequally spaced. In such cases, time series modeling is especially challenging (cf. Maller et al. 2008, Vityazev 1996), and a DySS approach with a more general time series model is not yet available; it is left for our future research.

Let us first discuss estimation of the regular longitudinal pattern from an IC dataset when the longitudinal data are correlated and follow an AR(1) model. Assume that there are m well-functioning individuals whose longitudinal observations follow the model (1), in which ε(t_ij) = σ_y(t_ij)ǫ(t_ij) and ǫ(t_ij) has mean 0 and variance 1 for each t_ij ∈ [0, 1]. Instead of assuming independence, in this subsection we assume that ǫ(t_ij) follows the AR(1) model

ǫ(t_ij) = φ ǫ(t_ij − ω) + e(t_ij), for i = 1, . . . , m, j = 1, . . . , J,    (11)

where φ is a coefficient, ω is the basic time unit defined in Section 2.2, and e(t) is a white noise process over [0, 1]. The model (11) is the conventional AR(1) model for cases with equally spaced observation times. It does not have a constant term on the right-hand side because the mean of ǫ(t_ij) is 0. In the current DySS problem, it is inconvenient to work with the model (11) directly, because there may not be observations at the times t_ij − ω used in (11). To overcome this difficulty, we can transform the AR(1) model (11) into the following time series model defined at the actual observation times {t_ij}:

ǫ(t_ij) = φ^{Δ_{i,j−1}} ǫ(t_{i,j−1}) + Θ̃_ij(z) e(t_ij),    (12)

where Δ_{i,j−1} = (t_ij − t_{i,j−1})/ω, z is the lag operator used in time series modeling (e.g., z ǫ(t) = ǫ(t − ω)), and

Θ̃_ij(z) = 1 + φ z + · · · + φ^{Δ_{i,j−1}−1} z^{Δ_{i,j−1}−1}

is a lag polynomial. As in Sections 2.1 and 2.2, let µ̂(t; Ṽ) and σ̂²_y(t) be the local pth-order polynomial kernel estimators of µ(t) and σ²_y(t), and

ǫ̂(t_ij) = ( y(t_ij) − µ̂(t_ij; Ṽ) ) / σ̂_y(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

Then, the least squares (LS) estimator of φ, denoted as φ̂, can be obtained by minimizing

Σ_{i=1}^m Σ_{j=2}^J { ǫ̂(t_ij) − φ^{Δ_{i,j−1}} ǫ̂(t_{i,j−1}) }².
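A minimal sketch of this LS step, assuming the standardized residuals and observation times are held in (m, J) arrays; the use of SciPy's bounded scalar minimizer is our own convenience, not the paper's prescription.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def estimate_phi(eps, times, omega):
    """LS estimator of the AR(1) coefficient phi based on model (12).

    eps   : (m, J) array of standardized residuals eps-hat(t_ij).
    times : (m, J) array of observation times t_ij (integer multiples of omega).
    """
    delta = np.diff(times, axis=1) / omega     # Delta_{i,j-1}; integer-valued

    def rss(phi):
        pred = phi ** delta * eps[:, :-1]      # phi^Delta_{i,j-1} * eps-hat(t_{i,j-1})
        return float(np.sum((eps[:, 1:] - pred) ** 2))

    return minimize_scalar(rss, bounds=(-0.999, 0.999), method="bounded").x
```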

Next, we use the estimated IC longitudinal pattern described by µ̂(t; Ṽ), σ̂²_y(t), and φ̂ to monitor a new individual's longitudinal pattern. Assume that the new individual's y observations follow the model (6), in which the error term ǫ(t*_j) follows the AR(1) model

ǫ(t*_j) = φ ǫ(t*_j − ω) + e(t*_j), for j = 1, 2, . . . ,

where φ is the same coefficient as in (11). Then, similar to the relationship between models (11) and (12), the above AR(1) model implies that

ǫ(t*_j) = φ^{Δ*_{j−1}} ǫ(t*_{j−1}) + e*_j,    (13)

where Δ*_{j−1} = (t*_j − t*_{j−1})/ω, e*_j = Θ̃*_j(z) e(t*_j), and

Θ̃*_j(z) = 1 + φ z + · · · + φ^{Δ*_{j−1}−1} z^{Δ*_{j−1}−1}.

It can be checked that, when the new individual is IC, {e*_j, j = 1, 2, . . .} are independent with mean 0 and variance σ²_{e*_j} = 1 − φ^{2Δ*_{j−1}}. In the SPC literature, it has been well demonstrated that, when ǫ(t*_j) follows the time series model (13), an upward mean shift in ǫ(t*_j) can be detected by monitoring {e*_j, j = 1, 2, . . .} (cf., e.g., Lu and Reynolds 2001). Based on all these results, to detect an upward mean shift in the original response y, the charting statistic of the upward CUSUM chart can be defined by

C⁺_j = max[ 0, C⁺_{j−1} + ( ǫ̂(t*_j) − φ̂^{Δ*_{j−1}} ǫ̂(t*_{j−1}) ) / √(1 − φ̂^{2Δ*_{j−1}}) − k ], for j ≥ 2,    (14)

where C⁺₁ = 0, k > 0 is an allowance constant, and ǫ̂(t*_j) = ( y(t*_j) − µ̂(t*_j; Ṽ) ) / σ̂_y(t*_j). The chart gives a signal when

C⁺_j > ρ,    (15)

where ρ > 0 is a control limit chosen to achieve a given ATS₀ value.
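A minimal sketch of the chart (14)-(15), following the conventions of the earlier sketches:

```python
def ar1_cusum(eps_hat, deltas, phi, k, rho):
    """Upward CUSUM chart (14)-(15) for AR(1)-correlated observations.

    eps_hat : standardized observations eps-hat(t*_1), eps-hat(t*_2), ...
    deltas  : deltas[j-1] = (t*_{j+1} - t*_j) / omega, integer-valued gaps.
    phi     : estimated AR(1) coefficient phi-hat.
    """
    c = 0.0                                          # C_1^+ = 0
    for j in range(1, len(eps_hat)):
        d = deltas[j - 1]
        innov = (eps_hat[j] - phi**d * eps_hat[j - 1]) / (1 - phi**(2 * d)) ** 0.5
        c = max(0.0, c + innov - k)                  # charting statistic (14)
        if c > rho:                                  # decision rule (15)
            return j
    return None
```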

At the end of this subsection, we would like to point out that there are actually two different ways to estimate µ(t), σ²_y(t), and φ from IC data when the AR(1) model (12) is involved. Because the estimation procedure described in Section 2.1 does not require the error covariance function V(s, t) to be known, the first way is to estimate µ(t) and σ²_y(t) by the four steps described in Section 2.1 without using any prior information about the error structure, and then estimate φ by the LS procedure described above. Alternatively, in the case when we are sure that the AR(1) model (12) is appropriate for describing the error structure, a parametric formula for V(s, t) can be derived in terms of φ, denoted as V_φ(s, t). Then µ(t) and σ²_y(t) can be estimated by µ̂(t; V_φ) and σ̂²_{y,φ}(t), respectively, where σ̂²_{y,φ}(t) is the same as σ̂²_y(t) except that µ̂(t; V_φ) instead of µ̂(t; Ṽ) is used in its construction. Then φ can be estimated by the LS estimator defined as the solution of

min_φ Σ_{i=1}^m Σ_{j=2}^J { ǫ̂_φ(t_ij) − φ^{Δ_{i,j−1}} ǫ̂_φ(t_{i,j−1}) }²,

where

ǫ̂_φ(t_ij) = ( y(t_ij) − µ̂(t_ij; V_φ) ) / σ̂_{y,φ}(t_ij), for i = 1, 2, . . . , m, j = 1, 2, . . . , J.

We have checked that the results of these two approaches are close to each other as long as the IC sample size is not small. In all simulation examples discussed in Section 3, the second approach is used, while the first approach is used in the real data example discussed in Section 4, because we are not sure whether the AR(1) model is appropriate at the beginning of the data analysis.

    2.4 Cases with arbitrary autocorrelation

In the previous two parts, we assume that the original observations {y(t*_j), j = 1, 2, . . .} of a new individual are either independent or correlated with the autocorrelation described by the AR(1) model (11). Also, the observation distribution is assumed to be normal. In practice, all these assumptions could be violated. If one or more of these model assumptions are violated, then the related control charts (8)-(9) and (14)-(15) may not be reliable, because their actual ATS₀ values could be substantially different from the assumed ATS₀ values. See related discussions in Hawkins and Deng (2010), Qiu and Hawkins (2001), Qiu and Li (2011a,b), and Zou and Tsung (2010). In this subsection, we propose a numerical approach to compute the control limit ρ of the chart (8)-(9) from IC data such that the chart is still appropriate to use in cases when within-subject observations are correlated with an arbitrary autocorrelation and when a parametric form of their distribution is unavailable.

Assume that there is an IC dataset consisting of a group of m well-functioning individuals whose observations of y follow the model (1). The data of the first m₁ individuals are used for obtaining the estimators µ̂(t; Ṽ) and σ̂²_y(t), as discussed in Section 2.1. Then, the control limit ρ of the chart (8)-(9) can be computed from the remaining m₂ = m − m₁ individuals using a block bootstrap procedure (e.g., Lahiri 2003), described below. First, we compute the standardized observations of the m₂ well-functioning individuals by

ǫ̂(t_ij) = ( y(t_ij) − µ̂(t_ij; Ṽ) ) / σ̂_y(t_ij), for i = m₁ + 1, m₁ + 2, . . . , m, j = 1, 2, . . . , J.

Second, we randomly select B individuals with replacement from the m₂ well-functioning individuals, and use their standardized observations to compute the value of ρ by a numerical searching algorithm, such that a given ATS₀ level is reached. Such a searching algorithm has been well described in Qiu (2008) and Qiu (2013, Section 4.2). Basically, we first give an initial value to ρ and compute the actual ATS₀ value of the chart (8)-(9) from the standardized observations of the B resampled individuals. If the actual ATS₀ value is smaller than the given ATS₀ value, then the value of ρ is increased; otherwise, the value of ρ is decreased. Then the actual ATS₀ value of the chart is computed again, using the updated value of ρ. This process is repeated until the given ATS₀ level is reached to a certain accuracy. In all examples of this paper, we choose m₁ = m/2.
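The following sketch combines the block bootstrap resampling (whole individuals as blocks, as described above) with a bisection-style search for ρ, reusing upward_cusum from the sketch in Section 2.2; the censoring of series that never signal is a crude simplification of our own, not part of the paper's algorithm.

```python
import numpy as np

def bootstrap_rho(eps_ic, gaps, k, ats0_target, B=1000,
                  lo=0.0, hi=10.0, tol=0.005, rng=None):
    """Block bootstrap search for the control limit rho of chart (8)-(9).

    eps_ic : (m2, J) standardized observations of the held-out IC individuals.
    gaps   : (J,) gaps between successive observation times, in basic time units.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    m2 = eps_ic.shape[0]
    blocks = eps_ic[rng.integers(0, m2, size=B)]   # B whole individuals, resampled
    times = np.cumsum(gaps)                        # observation times in basic units

    def ats0(rho):
        total = 0.0
        for eps in blocks:
            j = upward_cusum(eps, k, rho)          # sketch from Section 2.2
            total += times[j] if j is not None else times[-1]   # crude censoring
        return total / B

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ats0(mid) < ats0_target else (lo, mid)
    return 0.5 * (lo + hi)
```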

It should be pointed out that, if it is reasonable to assume that the within-subject observations are independent, then we can also draw bootstrap samples in the conventional way from the set of residuals {ǫ̂(t_ij), i = m₁ + 1, m₁ + 2, . . . , m, j = 1, 2, . . . , J}. Namely, we can randomly select J residuals with replacement from this set as the standardized observations of a bootstrap-resampled "individual", and B bootstrap-resampled individuals can be generated in that way for computing the value of ρ. When the within-subject observations are correlated, it has been demonstrated in the literature that a block bootstrap procedure, such as the one described above, is more appropriate (cf. Lahiri 2003).

    3 Numerical Study

In this section, we present some simulation results investigating the numerical performance of the proposed DySS procedure described in the previous section. In estimating the mean function µ(t) and the variance function σ²_y(t) of the longitudinal response y(t) from IC data by the local smoothing method described in Section 2.1, p is fixed at 1 (i.e., local linear kernel smoothing is used), the kernel function is chosen to be the Epanechnikov kernel function K(t) = 0.75(1 − t²)I(|t| ≤ 1), which is widely used in the curve estimation literature, and all bandwidths are chosen by the CV procedure.

In the first example, it is assumed that the IC mean function µ(t) and the IC variance function σ²_y(t) are known to be

µ(t) = 1 + 0.3 t^{1/2}, σ²_y(t) = µ²(t), for t ∈ [0, 1],

that observations of a given individual at different time points are independent of each other, and that the sampling rate is the same for all individuals, namely d observations every 10 basic time units. In such cases, the standardized observations are defined by (10), and the control limit ρ of the CUSUM chart (8)-(9) can easily be computed by a numerical algorithm, after its allowance constant k and the ATS₀ value are pre-specified (cf. Table 1 and the related description in Section 2.2). Now, let us consider mean shifts from µ(t) to

µ₁(t) = µ(t) + δ, for t ∈ [0, 1],

that occur at the initial time point, where δ is the shift size taking values from 0 to 2.0 with a step of 0.1. The basic time unit in this example is ω = 0.001, and the sampling rate d is chosen to be 2, 5, or 10. In the CUSUM chart (8)-(9), the ATS₀ value is fixed at 100, and the allowance constant k is chosen to be 0.1, 0.2, or 0.5. The corresponding ρ values are computed to be 2.820, 2.351, and 1.493 when d = 2; 4.612, 3.713, and 2.234 when d = 5; and 6.407, 4.937, and 2.852 when d = 10. The ATS₁ values of the chart in cases when d = 2, 5, 10, computed from 10,000 replicated simulations, are shown in Figure 1(a)-(c). From the plots, it can be seen that (i) the chart with a smaller (larger) k performs better for detecting small (large) shifts, and (ii) the ATS₁ values tend to be smaller when d is larger. The first result is generally true for CUSUM charts (cf. Hawkins and Olwell 1998), and the second result is intuitively reasonable because more observations are available for process monitoring when d is larger.

[Figure 1 about here. (a)-(c): ATS₁ values of the chart (8)-(9) in cases when the IC process mean function µ(t) and the process variance function σ²_y(t) are assumed known, k = 0.1, 0.2, 0.5, ATS₀ = 100, the step shift size δ changes from 0 to 2 with a step of 0.1, and the sampling rate is d = 2 (plot (a)), d = 5 (plot (b)), and d = 10 (plot (c)). (d)-(f): Corresponding results when the mean shift is a nonlinear drift from µ(t) to µ₁(t) = µ(t) + δ(1 − exp(−10t)), for t ∈ [0, 1]. The y-axes of all plots are in natural log scale.]

In practice, a more realistic scenario is that, once a shift occurs, the shift size changes over time. In the total cholesterol level example discussed in Section 1, if a patient's total cholesterol level begins to deviate from the regular pattern and no medical treatments are given in a timely fashion, then his/her total cholesterol level could deviate from the regular pattern more and more over time. To simulate such scenarios, we consider a nonlinear drift from µ(t) to

µ₁(t) = µ(t) + δ(1 − exp(−10t)), for t ∈ [0, 1],

that occurs at the initial time point, where δ is a constant. When δ changes from 0 to 2.0 with a step of 0.1 and the other settings remain the same as those in Figure 1(a)-(c), the ATS₁ values of the chart (8)-(9) are shown in Figure 1(d)-(f). From the plots, conclusions similar to those from Figure 1(a)-(c) can be made. We also performed simulations in cases when ω = 0.005 and ATS₀ = 20. The results are included in the supplementary file, and they demonstrate a pattern similar to that in Figure 1.

Next, we consider a more realistic situation in which the IC mean function µ(t) and the IC variance function σ²_y(t) are both unknown and need to be estimated from an IC dataset. Assume that the IC dataset includes observations of m individuals. For each individual, the sampling rate is d, which is assumed to be the same as that of the new individual under online monitoring, and the basic time unit is ω = 0.001. Then, when d = 2, each individual in the IC dataset has 200 observations in the design interval [0, 1]. After µ(t) and σ²_y(t) are estimated from the IC dataset, observations of the new individual are standardized by (7) for online monitoring. The standardized observations {ǫ̂(t*_j)} are regarded as i.i.d. random variables with common distribution N(0, 1) when the new individual is IC. When ATS₀, k, and d are given, the control limit ρ can be computed as if µ(t) and σ²_y(t) were known (cf. Table 1). Then, the actual ATS₀ value of the chart (8)-(9) is computed using 10,000 simulations of the online monitoring. This entire process, from the generation of the IC data to the computation of the actual ATS₀ value, is then repeated 100 times. The averaged actual ATS₀ values in cases when the nominal ATS₀ = 100, k = 0.1, 0.2, or 0.5, d = 2, 5, or 10, and m = 5, 10, or 20 are presented in Figure 2, along with the ATS₁ values of the chart for detecting step shifts of size δ = 0.25, 0.5, 0.75, and 1.0 occurring at the initial time point. These ATS₀ and ATS₁ values are also presented in Table S.1 of the supplementary file. From the figure and the table, it can be seen that: (i) the actual ATS₀ values are within 5% of the nominal ATS₀ value of 100 in all cases when m ≥ 10, (ii) the actual ATS₀ values are closer to the nominal ATS₀ value when d is larger, (iii) the ATS₁ values are smaller when δ is larger or d is larger, (iv) the ATS₁ values do not depend much on the value of m, especially when δ and d are large, and (v) the ATS₁ values are smallest when k = 0.1 and δ is small (i.e., δ = 0.25), when k = 0.2 and δ is medium (e.g., δ = 0.5 or 0.75), or when k = 0.5 and δ is large (i.e., δ = 1.0). The corresponding results for detecting the drifts considered in Figure 1(d)-(f) are presented in Figure S.2 and Table S.2 of the supplementary file. Results in cases when ATS₀ = 20 are presented in Tables S.3 and S.4 of the supplementary file. Similar conclusions can be drawn from all these results.

[Figure 2 about here. Actual ATS₀ and ATS₁ values of the chart (8)-(9), for detecting a step mean shift of size δ occurring at the initial time point, in cases when the IC mean function µ(t) and the IC variance function σ²_y(t) are estimated from IC data with m individuals, m = 5, 10, or 20, k = 0.1 (1st column), 0.2 (2nd column), or 0.5 (3rd column), d = 2 (1st row), 5 (2nd row), or 10 (3rd row), ω = 0.001, and the nominal ATS₀ is 100.]

In the above examples, the sampling rate d is chosen as 2, 5, and 10. In cases when ω = 0.001, each individual would then have 200, 500, and 1000 observations, respectively, in the design interval [0, 1]. In some applications, such as the total cholesterol level example discussed in Section 4, individuals have fewer observations. Next, in the example of Figure 2, we choose d to be 0.2, 0.5, and 1, m to be 10, 20, and 40, and keep the other parameters unchanged. The corresponding actual ATS₀ and ATS₁ values of the chart (8)-(9) for detecting the step shifts are shown in Table S.5 and Figure S.3 of the supplementary file, and the results for detecting the drifts are shown in Table S.6 and Figure S.4 of the supplementary file. From these figures and tables, it can be seen that most conclusions made above from Figures 2 and S.2 remain valid here, except that (i) for a given m value, the actual ATS₀ values in this case are a little farther from the nominal ATS₀ value of 100 than the actual ATS₀ values shown in Figures 2 and S.2, (ii) m ≥ 20 is required in this case for the actual ATS₀ values to be within about 5% of the nominal ATS₀ value, and (iii) the actual ATS₁ values in this case are generally larger than the corresponding values shown in Figures 2 and S.2. All these results are intuitively reasonable.

In the literature, there are some existing control charts designed specifically for detecting time-varying mean shifts, or mean drifts. One type of such control chart is based on the idea of designing a chart adaptively using a dynamic prediction of the shift size. Such adaptive CUSUM or EWMA charts (cf. Capizzi and Masarotto 2003, Shu and Jiang 2006, Sparks 2000) can be used in our proposed DySS method in place of the conventional CUSUM chart (8)-(9). Another control chart designed for detecting time-varying mean shifts is the reference-free cuscore (RFCuscore) chart of Han and Tsung (2006). Next, we consider using the adaptive CUSUM chart of Shu and Jiang (2006) and the RFCuscore chart of Han and Tsung (2006) in our DySS method. In the adaptive CUSUM chart, the charting statistic C⁺_j in (8) is changed to

C⁺_{j,A} = max[ 0, C⁺_{j−1,A} + ( ǫ̂(t*_j) − δ̂⁺_j/2 ) / ρ(δ̂⁺_j/2) ], for j ≥ 1,

where C⁺_{0,A} = 0, δ̂⁺_j is the predicted shift size at time j, defined by

δ̂⁺_j = max( δ⁺_min, (1 − λ) δ̂⁺_{j−1} + λ ǫ̂(t*_j) ), for j ≥ 1,

δ̂⁺₀ = δ⁺_min, δ⁺_min is a parameter denoting the minimum shift size, λ ∈ (0, 1] is a weighting parameter, ρ(k) is a control limit defined by

ρ(k) = [ log(1 + 2k²·ARL₀ + 2.332k) ] / (2k) − 1.166,

and ARL₀ = ATS₀ · d/10 is the pre-specified IC ARL value. The chart gives a signal at time j when C⁺_{j,A} > 1. As in Shu and Jiang (2006), we choose λ = 0.1 in this procedure. The RFCuscore chart gives a signal at time j if

max_{1≤w≤j} Σ_{i=j−w+1}^{j} |ǫ̂(t*_i)| ( ǫ̂(t*_i) − |ǫ̂(t*_i)|/2 ) ≥ c,

where c is a control limit chosen to reach a given ATS₀ level.
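A minimal sketch of the adaptive CUSUM recursion just described; the function name and interface are ours, and the ρ(k) formula is copied from the display above under our reading of its bracketing.

```python
import math

def adaptive_cusum(eps_hat, delta_min, arl0, lam=0.1):
    """Adaptive CUSUM of Shu and Jiang (2006) as used here: returns the
    index of the first signal (0-based), or None."""
    def rho(k):
        return math.log(1 + 2 * k**2 * arl0 + 2.332 * k) / (2 * k) - 1.166

    c, d_hat = 0.0, delta_min          # C_{0,A}^+ = 0, delta-hat_0^+ = delta_min
    for j, e in enumerate(eps_hat):
        d_hat = max(delta_min, (1 - lam) * d_hat + lam * e)   # predicted shift size
        c = max(0.0, c + (e - d_hat / 2) / rho(d_hat / 2))
        if c > 1.0:                    # signal when C_{j,A}^+ > 1
            return j
    return None
```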

In the example of Figures 2 and S.2, the actual ATS₀ and ATS₁ values of the adaptive CUSUM chart and the RFCuscore chart are computed in the same way as those of the chart (8)-(9), in cases when δ⁺_min is chosen to be 0.2, 0.4, and 1.0 and the other parameters take the same values as those in Figures 2 and S.2. These values are presented in Tables S.7-S.9 and Figures S.5-S.7 of the supplementary file. In cases when m = 20, k = 0.2 in the chart (8)-(9), and δ⁺_min = 0.4 in the adaptive CUSUM chart, the computed actual ATS₀ and ATS₁ values of the three charts are shown in Figure S.8 of the supplementary file, for detecting both step shifts and drifts. From these results, it can be seen that (i) the conventional CUSUM chart (8)-(9) and the adaptive CUSUM chart perform almost identically in all cases considered, and (ii) the RFCuscore chart performs slightly worse in certain cases, especially when detecting drifts.

The supplementary file contains some results for cases when observations at different time points are correlated and their error terms follow the AR(1) model (11). These results show that the CUSUM chart (14)-(15), which is based on the standardized error terms, is indeed effective in detecting mean shifts in the original process observations. In cases when µ(t), σ²_y(t), and φ are unknown and need to be estimated from an IC dataset, it seems that more IC data are needed to keep the actual ATS₀ values of the chart (14)-(15) within 10% of the nominal ATS₀ value, compared to the results in Figure 2 where within-subject observations are independent. Finally, in the supplementary file, we also consider an example in which the within-subject observations are correlated but the correlation does not follow the AR(1) model (11). The block bootstrap procedure discussed in Section 2.4 is used in designing our proposed DySS method. From the results of this example, it can be seen that the DySS method still performs well in such cases.

    4 Application to the Total Cholesterol Level Example

    In this section, we apply our proposed DySS method to the total cholesterol level example that

    is briefly described in Section 1. This example is a part of the SHARe Framingham Heart Study

    of the National Heart Lung and Blood Institute (cf., Cupples et al. 2007). In the data, there are

    21

  • 1028 non-stroke patients (i.e. m = 1028) and 27 stroke patients. In the study, each patient was

    followed 7 times (i.e., J = 7), and the total cholesterol level (in mg/100ml) was recorded at each

time. Because the total cholesterol level, denoted as y, is an important risk factor for stroke, it is

    important to detect its irregular temporal pattern so that medical interventions can be applied in a

    timely manner and stroke can be avoided. Here, we demonstrate that our proposed DySS method

    can be used for identifying irregular total cholesterol level patterns in this example.

    First, the observed data of the 1028 non-stroke patients are used as the IC data, from which

    we can estimate the IC mean function µ(t) and the IC variance function σ2y(t). To this end, the

    covariance function V (s, t) is first estimated by (5), and then µ(t) and σ2y(t) are estimated by

    µ̂(t; Ṽ ) and σ̂2y(t), respectively, as described in Section 2.1. The 95% pointwise confidence band

    of µ(t), defined to be µ̂(t; Ṽ ) ± 1.96σ̂y(t), and µ̂(t; Ṽ ) are presented in Figure 3 with the observed

    total cholesterol levels of the 27 stroke patients. From the figure, it can be seen that only 6 out of

    27 stroke patients are detected to have irregular total cholesterol level patterns by this confidence

interval approach, because their observed total cholesterol levels exceed the upper confidence limit at least once. From the figure, it can also be seen that there are no stroke patients whose observed total cholesterol levels fall below the lower confidence limit.
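As a baseline for comparison, this confidence-interval screening reduces to a pointwise check of each patient's observations against µ̂(t) ± 1.96σ̂y(t); below is a minimal sketch, assuming the estimated functions have already been evaluated at the patient's observation times (function and argument names are our own).

```python
import numpy as np

def band_flags(y, mu_hat, sigma_hat, z=1.96):
    """Flag a patient whose observed values ever leave the pointwise band
    mu_hat(t) +/- z * sigma_hat(t) evaluated at the observation times."""
    y, mu, sd = np.asarray(y), np.asarray(mu_hat), np.asarray(sigma_hat)
    above = bool(np.any(y > mu + z * sd))  # e.g., irregularly high cholesterol
    below = bool(np.any(y < mu - z * sd))
    return above, below
```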

    After µ(t) and σ2y(t) are estimated, we can compute the standardized residuals by

$$
\hat{\epsilon}(t_{ij}) = \frac{y(t_{ij}) - \hat{\mu}(t_{ij}; \widetilde{V})}{\hat{\sigma}_y(t_{ij})}, \quad \text{for } i = 1, \ldots, m,\; j = 1, \ldots, J.
$$

    To address the possible autocorrelation among {ǫ̂(tij)}, we consider the AR(1) model (11) (see also

    (12)), and the coefficient φ in the model is estimated by the LS procedure described in Section 2.3

    to be φ̂ = 0.9217. The goodness-of-fit of the AR(1) model is studied as follows. First, the predicted

    values of the model (cf., (12))

$$
\widetilde{\epsilon}(t_{ij}) = \hat{\phi}^{\,\Delta_{i,j-1}}\, \hat{\epsilon}(t_{i,j-1})
$$

    are ordered from the smallest to the largest and then divided into g groups of the same size. Namely,

    the kth group includes all predicted values in the interval Ik = [qk−1, qk), for k = 1, 2, . . . , g, where

    qk is the (k/g)th quantile of the predicted values {ǫ̃(tij)}, for k = 1, 2, . . . , g − 1, q0 = −∞, and

    qg = ∞. Let Ok be the proportion of all standardized residuals {ǫ̂(tij)} in the interval Ik, for

k = 1, 2, . . . , g.

[Figure 3 appears near here.]

Figure 3: The 95% pointwise confidence band of the mean total cholesterol level µ(t) (dashed curves), the point estimator µ̂(t; Ṽ) of µ(t) (dark solid curve), and the observed total cholesterol levels of the 27 stroke patients (thin curves). The horizontal axis shows age (years) and the vertical axis shows the total cholesterol level (mg/100ml).

Then, the following Pearson's $\chi^2$ statistic
$$
X^{2} = \sum_{k=1}^{g} \frac{(O_k - 1/g)^{2}}{1/g}
$$
measures the discrepancy between the distribution of the standardized residuals and the distribution

    of their predicted values by the AR(1) model. If the AR(1) model fits the data well, then the

distribution of $X^2$ should be approximately $\chi^2_{g-2}$, where the number of degrees of freedom is $g-2$ because

    there is one parameter (i.e., φ) in the AR(1) model that needs to be estimated beforehand. If we

    choose g = 12, then the observed value of X2 is 0.2265, giving a p-value of about 1.0. We also

    tried other g values in the interval [5, 50], and similarly large p-values were obtained. Therefore,

    the AR(1) model seems to fit the standardized residuals well.
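The goodness-of-fit computation just described is easy to script; a sketch in Python follows, where `eps` holds all standardized residuals and `eps_pred` their AR(1) predicted values (both hypothetical names), and X² is formed with proportions exactly as defined above.

```python
import numpy as np
from scipy import stats

def ar1_gof(eps, eps_pred, g=12):
    """Pearson chi-square goodness-of-fit check of the AR(1) fit.

    Groups I_k are delimited by the (k/g)th quantiles of the predicted
    values; O_k is the proportion of standardized residuals in I_k."""
    eps, eps_pred = np.asarray(eps), np.asarray(eps_pred)
    qs = np.quantile(eps_pred, np.arange(1, g) / g)  # interior cut points
    group = np.searchsorted(qs, eps, side="right")   # group index 0..g-1
    o_k = np.bincount(group, minlength=g) / len(eps)
    x2 = np.sum((o_k - 1.0 / g) ** 2 / (1.0 / g))
    # g - 2 degrees of freedom: one is lost for the estimated phi
    return x2, stats.chi2.sf(x2, df=g - 2)
```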

    After µ(t), σ2y(t) and φ are all estimated, next we design a CUSUM chart to detect irregular

    longitudinal patterns of the total cholesterol levels of the 27 stroke patients. In this example,

    because we are mainly concerned about the upward shifts in patients’ total cholesterol levels, only

    the upward CUSUM chart (14)-(15) is considered here. Since each stroke patient has an average

    of 2.3 observations every 10 years (cf., Figure 3), we choose d = 2. In the CUSUM chart, we

    choose k = 0.1 and ATS0 = 25. In such cases, the control limit ρ is computed to be 0.927. The

    CUSUM charts for monitoring the 27 stroke patients are presented in Figure 4, from which we can

    see that 22 out of the 27 stroke patients are detected to have upward mean shifts. The 22 signal


times, computed from the starting point of each patient's process monitoring, are listed in Table 2. The

    average signal time is 13.818 years. Compared to the confidence interval approach described at

    the beginning of this section, it can be seen that our DySS approach is more effective in detecting

    irregular longitudinal patterns of the total cholesterol levels.
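The monitoring step that produced these signals is a standard upward CUSUM; the sketch below assumes the AR(1)-decorrelated standardized residuals of one patient (the input to chart (14)-(15)) are already available, and uses the parameters of this example, (k, h) = (0.1, 0.927).

```python
def upward_cusum_signal(e, k=0.1, h=0.927):
    """Upward CUSUM C_j^+ = max(0, C_{j-1}^+ + e_j - k), with C_0^+ = 0.
    Returns the first index j (1-based) with C_j^+ > h, or None.
    The index can be mapped back to the patient's observation times to
    report the signal time in years, as in Table 2 below."""
    c = 0.0
    for j, ej in enumerate(e, start=1):
        c = max(0.0, c + ej - k)
        if c > h:
            return j
    return None
```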

Table 2: Signal times (years) of the 22 stroke patients detected by our proposed CUSUM chart (14)-(15).

Patient ID   Signal time     Patient ID   Signal time
    2            12              16           19
    3            13              17           12
    4            12              18            8
    5            22              19           12
    6            11              20           16
    7             8              21            8
    9            23              22            9
   11            24              23           11
   12             8              24           16
   13             7              25           26
   15             8              27           19
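As a quick arithmetic check, the 22 tabulated signal times sum to 304 years, so the average signal time is $304/22 \approx 13.818$ years, matching the value reported above.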

    5 Concluding Remarks

We have described the DySS method for dynamically identifying individuals with irregular longitudinal patterns. This approach makes use of all available information when monitoring the longitudinal pattern of an individual, including the information from an IC population of well-functioning individuals and all historical data of the individual being monitored, using both longitudinal data analysis and SPC techniques. Our numerical examples have shown

    that it is effective in various cases. Although this paper focuses on the monitoring of the mean

    function of the longitudinal response, similar DySS screening systems can be developed for mon-

    itoring the variance function and other features of the response that may affect the longitudinal

    performance of the individuals in a specific application. This approach can also be generalized to

    cases when multiple responses are present.

    In this paper, we have discussed cases when the observation times are regularly spaced, ob-

    servation times are irregularly spaced, and the observations at different time points are correlated.

The cases when observation times are irregularly spaced, the observations at different time points are correlated, and the correlation needs to be described by a parametric time series model are especially challenging, because there is little existing discussion in the literature about time series modeling in such cases. In this paper, only the case when the AR(1) model (11) is appropriate is discussed. Future research is needed to handle cases with more flexible parametric time series models.

[Figure 4 appears near here.]

Figure 4: CUSUM charts (14)-(15) with (k, h, ATS0) = (0.1, 0.927, 25) for monitoring the 27 stroke patients (one panel per patient). The dashed horizontal lines denote the control limit ρ.

    Supplementary Materials

    supplemental.pdf: This pdf file discusses some statistical properties of the estimator µ̂(t; Ṽ )

    defined in Section 2.1, some technical details about the adjustment procedure (15) in Li

    (2011) and the CV procedure discussed in Section 2.1, and some extra numerical results.

    ComputerCodesAndData.zip: This zip file contains MATLAB source codes of our proposed

    DySS method and the stroke data used in the paper.


Acknowledgments: We thank the editor, the associate editor, and two referees for many

    constructive comments and suggestions, which greatly improved the quality of the paper. This

research is supported in part by an NSF grant and two grants from the Natural Science Foundation

    of China.

    References

Apley, D.W., and Tsung, F. (2002), "The autoregressive T² chart for monitoring univariate auto-

    correlated processes,” Journal of Quality Technology, 34, 80–96.

    Brockwell, P., Davis, R., and Yang, Y. (2007), “Continuous-time Gaussian autoregression,” Sta-

    tistica Sinica, 17, 63–80.

    Capizzi, G., and Masarotto, G. (2003), “An adaptive exponentially weighted moving average

    control chart,” Technometrics, 45, 199–207.

    Chen, K., and Jin, Z. (2005), “Local polynomial regression analysis of clustered data,” Biometrika,

    92, 59–74.

Cupples, L.A. et al. (2007), "The Framingham Heart Study 100K SNP genome-wide association

    study resource: overview of 17 phenotype working group reports,” BMC Medical Genetics,

    8, (Suppl 1): S1.

    Ding, Y., Zeng, L., and Zhou, S. (2006), “Phase I analysis for monitoring nonlinear profiles in

    manufacturing processes,” Journal of Quality Technology, 38, 199–216.

    Gan, F.F. (1995), “Joint monitoring of process mean and variance using exponentially weighted

    moving average control charts,” Technometrics, 37, 446–453.

    Han, D., and Tsung, F. (2006), “A reference-free cuscore chart for dynamic mean change detection

    and a unified framework for charting performance comparison,” Journal of the American

    Statistical Association, 101, 368–386.

    Hawkins, D.M., and Deng, Q. (2010), “A nonparametric change-point control chart,” Journal of

    Quality Technology, 42, 165–173.


Hawkins, D.M., and Olwell, D.H. (1998), Cumulative Sum Charts and Charting for Quality Im-

    provement, New York: Springer-Verlag.

    Jensen, W.A., and Birch, J.B. (2009), “Profile monitoring via nonlinear mixed models,” Journal

    of Quality Technology, 41, 18–34.

    Jiang, W. (2004), “Multivariate control charts for monitoring autocorrelated processes,” Journal

    of Quality Technology, 36, 367–379.

    Jiang, W., Tsui, K.L., and Woodall, W. (2000), “A new SPC monitoring method: the ARMA

    chart,” Technometrics, 42, 399–410.

    Knoth, S. (2011), “spc: Statistical Process Control,” R package version 0.4.0, http://CRAN.R-

    project.org/package=spc.

    Lahiri, S.N. (2003), Resampling Methods for Dependent Data, New York: Springer.

    Li, Y. (2011), “Efficient semiparametric regression for longitudinal data with nonparametric co-

    variance estimation,” Biometrika, 98, 355–370.

    Lu, C.W., and Reynolds, M.R., Jr. (2001), “Control charts for monitoring an autocorrelated

    process,” Journal of Quality Technology, 33, 316–334.

    Ma, S., Yang, L., and Carroll, R. (2012), “A simultaneous confidence band for sparse longitudinal

    regression,” Statistica Sinica, 22, 95–122.

    Maller, R.A., Müller, G., and Szimayer, A. (2008), “GARCH modelling in continuous time for

    irregularly spaced time series data,” Bernoulli, 14, 519–542.

    Montgomery, D. C. (2009), Introduction To Statistical Quality Control (6th edition), New York:

    John Wiley & Sons.

    Moustakides, G.V. (1986), “Optimal stopping times for detecting changes in distributions,” The

    Annals of Statistics, 14, 1379–1387.

    Pan, J., Ye, H., and Li, R. (2009), “Nonparametric regression of covariance structures in lon-

    gitudinal studies,” Technical Report, 2009.46, School of Mathematics, The University of

    Manchester, UK.

Qiu, P. (2008), "Distribution-free multivariate process control based on log-linear modeling," IIE Transactions, 40, 664–677.

Qiu, P. (2013), Introduction to Statistical Process Control, London: Chapman & Hall/CRC.

    Qiu, P., and Hawkins, D. (2001), “A rank based multivariate CUSUM procedure,” Technometrics,

    43, 120–132.

    Qiu, P., and Li, Z. (2011a), “On nonparametric statistical process control of univariate processes,”

    Technometrics, 53, 390–405.

    Qiu, P., and Li, Z. (2011b), “Distribution-free monitoring of univariate processes,” Statistics and

    Probability Letters, 81, 1833–1840.

    Qiu, P., Zou, C., and Wang, Z. (2010), “Nonparametric profile monitoring by mixed effects mod-

    eling (with discussions),” Technometrics, 52, 265–277.

    Reynolds, M.R., Jr., Amin, R.W., and Arnold, J.C. (1990), “CUSUM charts with variable sam-

    pling intervals,” Technometrics, 32, 371–384.

    Shu, L., and Jiang, W. (2006), “A Markov chain model for the adaptive cusum control chart,”

    Journal of Quality Technology, 38, 135–147.

    Sparks, R.S. (2000), “CUSUM charts for signalling varying location shifts,” Journal of Quality

    Technology, 32, 157–171.

    Vityazev, V.V. (1996), “Time series analysis of unequally spaced data: the statistical properties

    of the Schuster periodogram,” Astronomical and Astrophysical Transactions, 11, 159–173.

    Yao, F., Müller, H.G., and Wang, J.L. (2005), “Functional data analysis for sparse longitudinal

    data,” Journal of the American Statistical Association, 100, 577–590.

    Yeh, A.B., Lin, D.K.J., and Venkataramani, C. (2004), “Unified CUSUM charts for monitoring

    process mean and variability,” Quality Technology and Quantitative Management, 1, 65–86.

    Zhao, Z., and Wu, W. (2008), “Confidence bands in nonparametric time series regression,” Annals

    of Statistics, 36, 1854–1878.

    Zou, C., and Tsung, F. (2010), “Likelihood ratio-based distribution-free EWMA control charts,”

    Journal of Quality Technology, 42, 1–23.
