Introduction to Longitudinal Data Analysis
Fotis Siannis, University of Athens, Department of Mathematics
April 27, 2012
Bibliography
• Weiss R. (2005). Modeling Longitudinal Data. Springer.
• Diggle P.J., Heagerty P., Liang K.Y. and Zeger S. (2002). Analysis of Longitudinal Data. Oxford Statistical Science Series.
• Fitzmaurice G.M., Laird N.M. and Ware J.H. (2004). Applied Longitudinal Analysis. Wiley.
• Davis C. (2002). Statistical Methods for the Analysis of Repeated Measures. Springer.
• Crowder M.J. and Hand D.J. (1990). Analysis of Repeated Measures. Chapman & Hall.
Longitudinal Data Analysis 1
Introduction
• We are familiar with the assumptions behind linear regression models, in particular that observations are independent.
• The defining feature of longitudinal studies is that measurements of the same individual are collected repeatedly over time.
• As a result, observations on the same individual must be associated.
• Hence, the assumption of independent observations cannot be justified.
• The availability of repeated measurements on the same subjects at several time points certainly offers more information than cross-sectional studies.
• Longitudinal studies allow the study of change over time (within-subject change).
• The primary goals in studies of this kind are to:
  – characterize the change of response over time
  – investigate the factors that influence it
• Responses can be either univariate or multivariate (here we focus on univariate responses).
[Figure: Reading ability plotted against age (10 to 18 years).]
Example: CD4+ Cell Numbers (macs data; Diggle et al.)
The Human Immunodeficiency Virus (HIV) causes AIDS by attacking and reducing CD4+ cells, and hence reducing a person's ability to fight infection.
• An uninfected individual has around 1100 cells per millilitre of blood
• CD4+ cells decrease in number with time from infection
• CD4+ cell number can be used to monitor disease progression
We have 2376 values of CD4+ cell number from 369 infected individuals. We plot CD4+ values against time since seroconversion (the time at which HIV becomes detectable). [Multicenter AIDS Cohort Study, MACS (Kaslow et al. 1987)]
[Figure: CD4+ cell numbers plotted against years since seroconversion, macs data.]
Example: Treatment of Lead Exposed Children (TLC) Trial
(Fitzmaurice et al.)
• The TLC trial was a placebo-controlled, randomized study of succimer (a chelating agent) in children with blood lead levels of 20-44 micrograms/dL.
• These data consist of four repeated measurements of blood lead levels obtained at baseline (week 0), week 1, week 4, and week 6 on 100 children who were randomly assigned to chelation treatment with succimer or to placebo.

Group      Baseline     Week 1       Week 4       Week 6
Succimer   26.5 (5.0)   13.5 (7.7)   15.5 (7.8)   20.8 (9.2)
Placebo    26.3 (5.0)   24.7 (5.5)   24.1 (5.8)   23.6 (5.6)

Table 1: Mean blood lead levels (sd), in micrograms/dL, from the TLC trial.
[Figure: Blood lead level plotted against time (weeks) for the TLC trial, followed by the mean blood lead level over time in the Succimer and Placebo groups.]
Example: Small Mice Data (Weiss)
• Weights in milligrams of new-born male mice.
• All from mothers from a single strain.
• 14 mice measured every 3 days.
• Measurements from day 2 up to day 20.
• Balanced data set.
R console output, wide layout (one row per mouse):

     Group  id  weight.2  weight.5  weight.8  weight.11  weight.14  weight.17  weight.20
1        3  22       190       388       621        823       1078       1132       1191
8        3  23       218       393       568        729        839        852       1004
15       3  24       141       260       472        662        760        885        878
22       3  25       211       394       549        700        783        870        925
29       3  26       209       419       645        850       1001       1026       1069
36       3  27       193       362       520        530        641        640        751
43       3  28       201       361       502        530        657        762        888
50       3  29       202       370       498        650        795        858        910
57       3  30       190       350       510        666        819        879        929
64       3  31       219       399       578        699        709        822        953
71       3  32       225       400       545        690        796        825        836
78       3  33       224       381       577        756        869        929        999
85       4  34       187       329       441        525        589        621        796
92       4  35       278       471       606        770        888       1001       1105

R console output, long layout (one row per measurement; first 21 rows):

     Group  id  weight  day
1        3  22     190    2
2        3  22     388    5
3        3  22     621    8
4        3  22     823   11
5        3  22    1078   14
6        3  22    1132   17
7        3  22    1191   20
8        3  23     218    2
9        3  23     393    5
10       3  23     568    8
11       3  23     729   11
12       3  23     839   14
13       3  23     852   17
14       3  23    1004   20
15       3  24     141    2
16       3  24     260    5
17       3  24     472    8
18       3  24     662   11
19       3  24     760   14
20       3  24     885   17
21       3  24     878   20
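The two printouts above hold the same measurements, once with one row per mouse and once with one row per measurement. As a rough sketch of how the first layout could be turned into the second, here is a plain Python version (not the R session used on the slides). The function name `wide_to_long` and the dictionary layout are illustrative assumptions, and only the first three mice are typed in.

```python
# Measurement days shared by all mice (balanced design).
DAYS = [2, 5, 8, 11, 14, 17, 20]

# Wide layout: one row per mouse, one weight per measurement day.
wide = [
    {"group": 3, "id": 22, "weights": [190, 388, 621, 823, 1078, 1132, 1191]},
    {"group": 3, "id": 23, "weights": [218, 393, 568, 729, 839, 852, 1004]},
    {"group": 3, "id": 24, "weights": [141, 260, 472, 662, 760, 885, 878]},
]


def wide_to_long(rows, days):
    """Return one record per mouse and occasion (the long layout)."""
    long_rows = []
    for row in rows:
        for day, weight in zip(days, row["weights"]):
            long_rows.append(
                {"group": row["group"], "id": row["id"], "weight": weight, "day": day}
            )
    return long_rows


long_rows = wide_to_long(wide, DAYS)
for record in long_rows[:3]:
    print(record)
```

The long layout is usually the more convenient one for unbalanced data, because a missing occasion is simply a missing row.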
[Figure: Weight plotted against day for the small mice data.]
Distinctive Feature
Longitudinal data are clustered. Clusters of data are created by the repeated measurements obtained from the same subject/individual at different times/occasions.
• This feature implies that observations of this kind are correlated, and 'common sense' says that they are positively correlated.
• This correlation is usually not of interest in itself.
• However, this correlation needs to be accounted for in the analysis, because it invalidates the 'common' assumption of independent observations.
• Observations from different subjects are assumed NOT to be correlated.
• Clustered data can arise in many different ways. Family, school, hospital, and household data are clusters that produce correlated data.
Objectives
Longitudinal studies play an important role in the health sciences. They allow us to:
• Investigate heterogeneity among individuals (genetic, social, behavioral).
• Investigate changes in response over time. This is not possible in cross-sectional studies, where the within-subject and between-subject factors that influence the changes over time cannot be distinguished.
• Relate changes to covariates.
• Make predictions about how specific individuals change over time.
Terminology
• In a longitudinal data (LD) study, the units being studied are referred to as subjects or individuals.
• Individuals are measured at different times or occasions.
• The number of repeated observations and their timing can vary between studies and/or individuals.
  – A study where all individuals have the same number of observations, usually at the same occasions, is called balanced.
  – Otherwise the study is unbalanced (the 'norm' for LD studies).
• Missing data are very common, leading to incomplete data.
• Data can be collected prospectively (advisable) or retrospectively (often poor-quality data).
Balanced Studies
• A clinical trial measuring the efficacy of an analgesic agent, taking repeated measures of a self-reported pain scale at baseline and at the end of six 15-min intervals.
• Balanced studies usually arise when the length of the study is short or when humans are not the main subject of investigation (e.g. rats).
Unbalanced Studies
• Arthritis patients are scheduled to visit the clinic at 6-month intervals, but they either miss a visit or the timing is never exactly 6 months (6-12 months).
• Most health-related studies.
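The distinction between balanced and unbalanced studies can be checked mechanically once the measurement times are stored per subject. The sketch below is only an illustration: the function name `is_balanced` and both small data sets are hypothetical, invented to mirror the pain-scale and arthritis examples above.

```python
def is_balanced(times_by_subject):
    """Return True when every subject was measured at the same set of occasions."""
    schedules = [tuple(sorted(times)) for times in times_by_subject.values()]
    return all(schedule == schedules[0] for schedule in schedules)


# Hypothetical pain trial: baseline plus the end of six 15-minute intervals.
pain_trial = {subject: [0, 15, 30, 45, 60, 75, 90] for subject in ("A", "B", "C")}

# Hypothetical arthritis clinic: visit times (months) drift and visits are missed.
arthritis = {"A": [0, 6, 12, 18], "B": [0, 7, 19], "C": [0, 6.5, 12, 20]}

print(is_balanced(pain_trial))  # True
print(is_balanced(arthritis))   # False
```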
Notation
• Let $Y_{ij}$ denote the response of the $i$-th individual ($i = 1, \ldots, N$) at the $j$-th occasion ($j = 1, \ldots, n$). (This notation is sufficient if the measurements are equally spaced.)
• Given that we have $n$ repeated measures for each individual, we can group them in an $n \times 1$ vector
$$Y_i = \begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in} \end{pmatrix} \quad \text{or} \quad Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in})'.$$
• Interest lies in the mean response and how it changes with covariates (treatment group, age, sex, ...):
$$\mu_j = E(Y_{ij}).$$
If we allow the mean response to differ across individuals, then
$$\mu_{ij} = E(Y_{ij}).$$
Data Structures
The general layout is
• N subjects
• from which we get n repeated measures
• at times ti
• Yij response of interest from subject i at occasion j
• with covariates $x_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijp})$. Generally the covariate values may vary across the repeated measurements
• Missing indicator
$$\delta_{ij} = \begin{cases} 1, & \text{if } Y_{ij} \text{ and } x_{ij} \text{ are observed;} \\ 0, & \text{if they are missing.} \end{cases}$$
Layout for the one-sample case
             Occasion
Subject   1        ...   j        ...   n
1         y_{11}   ...   y_{1j}   ...   y_{1n}
...       ...            ...            ...
i         y_{i1}   ...   y_{ij}   ...   y_{in}
...       ...            ...            ...
N         y_{N1}   ...   y_{Nj}   ...   y_{Nn}
Subject   Time point   Missing indicator   Response      Covariates
1         1            δ_{1,1}             y_{1,1}       x_{1,1,1} ... x_{1,1,p}
          ...          ...                 ...           ...
          j            δ_{1,j}             y_{1,j}       x_{1,j,1} ... x_{1,j,p}
          ...          ...                 ...           ...
          t_1          δ_{1,t_1}           y_{1,t_1}     x_{1,t_1,1} ... x_{1,t_1,p}
...
i         1            δ_{i,1}             y_{i,1}       x_{i,1,1} ... x_{i,1,p}
          ...          ...                 ...           ...
          j            δ_{i,j}             y_{i,j}       x_{i,j,1} ... x_{i,j,p}
          ...          ...                 ...           ...
          t_i          δ_{i,t_i}           y_{i,t_i}     x_{i,t_i,1} ... x_{i,t_i,p}
...
n         1            δ_{n,1}             y_{n,1}       x_{n,1,1} ... x_{n,1,p}
          ...          ...                 ...           ...
          j            δ_{n,j}             y_{n,j}       x_{n,j,1} ... x_{n,j,p}
          ...          ...                 ...           ...
          t_n          δ_{n,t_n}           y_{n,t_n}     x_{n,t_n,1} ... x_{n,t_n,p}

Table 2: General layout for repeated measurements (in this table the subjects are indexed 1, ..., n and subject i has t_i time points)
                     Time point
Group   Subject   1              ...   j              ...   t
1       1         y_{1,1,1}      ...   y_{1,1,j}      ...   y_{1,1,t}
        ...       ...                  ...                  ...
        i         y_{1,i,1}      ...   y_{1,i,j}      ...   y_{1,i,t}
        ...       ...                  ...                  ...
        n_1       y_{1,n_1,1}    ...   y_{1,n_1,j}    ...   y_{1,n_1,t}
...
h       1         y_{h,1,1}      ...   y_{h,1,j}      ...   y_{h,1,t}
        ...       ...                  ...                  ...
        i         y_{h,i,1}      ...   y_{h,i,j}      ...   y_{h,i,t}
        ...       ...                  ...                  ...
        n_h       y_{h,n_h,1}    ...   y_{h,n_h,j}    ...   y_{h,n_h,t}
...
s       1         y_{s,1,1}      ...   y_{s,1,j}      ...   y_{s,1,t}
        ...       ...                  ...                  ...
        i         y_{s,i,1}      ...   y_{s,i,j}      ...   y_{s,i,t}
        ...       ...                  ...                  ...
        n_s       y_{s,n_s,1}    ...   y_{s,n_s,j}    ...   y_{s,n_s,t}

Table 3: Layout for the special case of multiple samples
Dependence & Correlation
Consider a simple LD design that is balanced and complete, with $n$ measurements of the response variable at a common set of occasions on $N$ individuals.
• Expectation: $\mu_{ij} = E(Y_{ij})$.
• Variance: $\sigma_j^2 = E\{[Y_{ij} - E(Y_{ij})]^2\} = E\{(Y_{ij} - \mu_{ij})^2\}$.
• Covariance: $\sigma_{jk} = E\{(Y_{ij} - \mu_{ij})(Y_{ik} - \mu_{ik})\}$.
• Correlation: $\rho_{jk} = \dfrac{E\{(Y_{ij} - \mu_{ij})(Y_{ik} - \mu_{ik})\}}{\sigma_j \sigma_k}$.
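• We anticipate observations on the same individual to be positively correlated. Thus
$$\mathrm{Cov}\begin{pmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{in} \end{pmatrix} = \begin{pmatrix} \mathrm{Var}(Y_{i1}) & \mathrm{Cov}(Y_{i1}, Y_{i2}) & \cdots & \mathrm{Cov}(Y_{i1}, Y_{in}) \\ \mathrm{Cov}(Y_{i2}, Y_{i1}) & \mathrm{Var}(Y_{i2}) & \cdots & \mathrm{Cov}(Y_{i2}, Y_{in}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(Y_{in}, Y_{i1}) & \mathrm{Cov}(Y_{in}, Y_{i2}) & \cdots & \mathrm{Var}(Y_{in}) \end{pmatrix} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{pmatrix}$$
where:
• $\mathrm{Cov}(Y_{ij}, Y_{ik}) = \sigma_{jk} = \sigma_{kj} = \mathrm{Cov}(Y_{ik}, Y_{ij})$;
• $\sigma_{kk} = \mathrm{Cov}(Y_{ik}, Y_{ik}) = \mathrm{Var}(Y_{ik}) = \sigma_k^2$.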
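Hence, the covariance matrix takes the simple form
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix},$$
and equally we can define the correlation matrix
$$\mathrm{Corr}(Y_i) = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & 1 & \cdots & \rho_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n1} & \rho_{n2} & \cdots & 1 \end{pmatrix},$$
where $\mathrm{Corr}(Y_{ij}, Y_{ik}) = \rho_{jk} = \rho_{kj} = \mathrm{Corr}(Y_{ik}, Y_{ij})$.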
Example: TLC Trial (cont.)
Objective: Investigate whether treatment with succimer reduces blood lead levels over time, relative to any changes observed in the placebo group.
$$H_0: \mu_j(S) = \mu_j(P), \quad \text{for all } j = 1, \ldots, 4,$$
where $\mu_j(S)$ and $\mu_j(P)$ are the succimer and placebo mean responses at the $j$th occasion.
Alternatively,
$$H_0: \mu_j(S) - \mu_1(S) = \mu_j(P) - \mu_1(P), \quad \text{for all } j = 2, \ldots, 4,$$
which states that the changes in the mean response from baseline are equal in the two treatments.
Note: The second version of the null hypothesis concerns only the changes in the means, so it allows for differences at baseline. Hence it is implied by the first null hypothesis, which makes the second less restrictive.
Restrict attention to the placebo group and let's explore the interdependence of the four measures of blood lead level. First, explore the time plot
[Figure: Mean blood lead level plotted against time (weeks), placebo group.]
while secondly we can explore the pairwise scatterplots for children in the placebo group
[Figure: Pairwise scatterplots of blood lead level at Baseline, Week 1, Week 4 and Week 6 for children in the placebo group.]
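The estimated covariances are
$$\widehat{\mathrm{Cov}}(Y_i) = \begin{pmatrix} 25.2 & 22.8 & 24.3 & 21.4 \\ 22.8 & 29.8 & 27.0 & 23.4 \\ 24.3 & 27.0 & 33.1 & 28.2 \\ 21.4 & 23.4 & 28.2 & 31.8 \end{pmatrix},$$
and correlations
$$\widehat{\mathrm{Corr}}(Y_i) = \begin{pmatrix} 1 & 0.83 & 0.84 & 0.76 \\ 0.83 & 1 & 0.86 & 0.76 \\ 0.84 & 0.86 & 1 & 0.87 \\ 0.76 & 0.76 & 0.87 & 1 \end{pmatrix}.$$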
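The correlation matrix follows from the covariance matrix by dividing each covariance by the product of the two standard deviations. As a quick check of the numbers quoted above, the following Python sketch (standard library only; the function name `cov_to_corr` is an illustrative choice) recomputes the correlations from the estimated covariances.

```python
import math

# Estimated covariance matrix of the four placebo-group measurements
# (baseline, week 1, week 4, week 6), copied from the slide.
cov = [
    [25.2, 22.8, 24.3, 21.4],
    [22.8, 29.8, 27.0, 23.4],
    [24.3, 27.0, 33.1, 28.2],
    [21.4, 23.4, 28.2, 31.8],
]


def cov_to_corr(cov):
    """Divide each covariance by the product of the two standard deviations."""
    size = len(cov)
    sd = [math.sqrt(cov[j][j]) for j in range(size)]
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(size)] for j in range(size)]


corr = cov_to_corr(cov)
for row in corr:
    print("  ".join(f"{value:.2f}" for value in row))
```

Rounded to two decimals, the off-diagonal entries agree with the correlation matrix shown on the slide.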
What if we ignore the correlation in the analysis?
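A natural estimate of the change in the mean response is
$$\hat\delta = \hat\mu_2 - \hat\mu_1,$$
where $\hat\mu_j = \frac{1}{N}\sum_{i=1}^{N} Y_{ij}$. For the treatment group we have $\hat\mu_2 - \hat\mu_1 = 13.5 - 26.5 = -13.0$.
To obtain an estimate of its standard error we calculate
$$\mathrm{Var}(\hat\delta) = \mathrm{Var}\left\{\frac{1}{N}\sum_{i=1}^{N}(Y_{i2} - Y_{i1})\right\} = \frac{1}{N}\left(\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}\right),$$
which in our case becomes
$$\widehat{\mathrm{Var}}(\hat\delta) = \frac{1}{50}\left(25.2 + 58.9 - 2(15.5)\right) = 1.06.$$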
If we had simply ignored the existing correlation, then
• We would implicitly assume that $\sigma_{12} = 0$, and hence
$$\widehat{\mathrm{Var}}(\hat\delta) = \frac{1}{50}(25.2 + 58.9) = 1.68,$$
which is approximately 1.6 times larger.
• This leads to confidence intervals that are too wide and to p-values for the test of $H_0: \delta = 0$ that are too large.
In summary
• the correlation between the observations is a good thing
• failure to take account of the correlation in the analysis could lead to misleading scientific inferences
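The arithmetic behind this comparison is short enough to reproduce directly. The sketch below, in plain Python, plugs the quoted estimates (variances 25.2 and 58.9, covariance 15.5, N = 50) into the variance formula for the mean change, once with the covariance and once pretending it is zero. The function and variable names are illustrative only.

```python
N = 50                      # children in the succimer group
var_baseline = 25.2         # estimated variance at baseline
var_week1 = 58.9            # estimated variance at week 1
cov_baseline_week1 = 15.5   # estimated covariance between the two occasions


def variance_of_mean_change(var_1, var_2, cov_12, n):
    """Variance of the mean of the within-subject differences Y_i2 - Y_i1."""
    return (var_1 + var_2 - 2.0 * cov_12) / n


var_correlated = variance_of_mean_change(var_baseline, var_week1, cov_baseline_week1, N)
var_independent = variance_of_mean_change(var_baseline, var_week1, 0.0, N)

print(round(var_correlated, 2), round(var_independent, 2))  # 1.06 1.68
print(round(var_independent / var_correlated, 1))           # 1.6
```

Because the covariance is positive, subtracting it shrinks the variance of the estimated change, which is why ignoring it overstates the uncertainty.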
Pros
• Investigate pattern of change
• Subjects serve as their own controls, since the response variable is measured under both control (baseline) and experimental conditions
• Data collected from the same subjects are more reliable
• While we can address the same questions as in a cross-sectional study, in LD analysis we can also separate what are called cohort and age effects
Cons
• Complications in the analysis due to the correlation between observations
• The investigator does not always control the circumstances
  – unbalanced designs
  – missing data (pattern!)
Start: Plots/Graphical Presentation
Initially we discuss how we can analyze data that come from one population/group. We intend to explore
• how observations change over time
• what may influence possible changes over time.
Initially assume that we have a balanced study. This is an important and reasonable assumption for some kinds of analysis that cannot adjust for certain forms of irregularity.
For example, it is very important to have observations at regular time points, so that quantities like the mean response at a specific occasion can be calculated.
Assume we have the data from the TLC study, only from the Placebo group.
1. No matter what we are planning to do with the analysis of LD data, the first step is always the creation of a scatterplot. For the balanced TLC data we have
[Figure: Scatterplot of blood lead level against time (weeks), TLC data.]
while for the unbalanced macs data we have
[Figure: Scatterplot of CD4+ cell numbers against years since seroconversion, macs data.]
2. As soon as we have explored the scatterplot, we plot the time or profile plot. For the TLC data there is little hope of getting something very useful out of it (usually the case),
[Figure: Time (profile) plot of blood lead level against time (weeks), TLC data.]
The smallmice scatterplot definitely provides some intuition about the nature of the data
[Figure: Weight plotted against days, small mice data.]
while for the unbalanced macs data we have a seriously messy situation
[Figure: Time plot of CD4+ cell numbers against years since seroconversion, macs data.]
One way of getting some useful information out of it is to simply plot a small random sample of time plots
[Figure: Time plots of blood lead level against time (weeks) for a small random sample of children, TLC data.]
while for the unbalanced macs data we have
[Figure: Time plots of CD4+ cell numbers against years since seroconversion for a small random sample of individuals, macs data.]
However, there is always the danger that the chosen time plots are not representative of the population. A possible 'fix' to this problem is
• Choose a variable
• Observe the time plots for different levels of this variable (if this is a binary or factor variable)
• Observe the time plots for the different quantiles of this variable (if this is a continuous variable)
• This variable could be one of the explanatory variables
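One way to make the sampled profiles more representative, along the lines of the steps above, is to split the subjects into quartile groups of a continuous covariate and draw a few subjects from each group. The Python sketch below is only an illustration: the function name `sample_by_quartile`, the `per_group` argument and the ages are all hypothetical, and it assumes Python 3.8 or later for `statistics.quantiles`.

```python
import random
import statistics


def sample_by_quartile(age_by_subject, per_group, seed=0):
    """Pick up to per_group subjects at random from each age-quartile group."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    cuts = statistics.quantiles(age_by_subject.values(), n=4)  # three cut points
    groups = {1: [], 2: [], 3: [], 4: []}
    for subject, age in age_by_subject.items():
        group = 1 + sum(age > cut for cut in cuts)  # number of cut points exceeded
        groups[group].append(subject)
    return {
        group: sorted(rng.sample(ids, min(per_group, len(ids))))
        for group, ids in groups.items()
    }


# Hypothetical subjects 1..40 with ages 21..60.
ages = {subject: 20 + subject for subject in range(1, 41)}
chosen = sample_by_quartile(ages, per_group=3)
print(chosen)
```

The time plots of the chosen subjects can then be drawn in one panel per group.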
Consider the time plots for the macs data by age quantiles
[Figure: Time plots of CD4+ cell numbers against years since seroconversion in four panels, (1) to (4), one per age quantile group, macs data.]
3. One way to explore changes in response over time is to create boxplots. This is possible only in balanced studies, where occasions are common for everybody. For the smallmice data
[Figure: Boxplots of weight at days 2, 5, 8, 11, 14, 17 and 20, small mice data.]
Simple Analysis
The apparent complication from the fact that we have repeated measurements could be overcome by summarizing the longitudinal data.
1. Perhaps the simplest univariate summary of LD data is the average of the responses from a single subject
$$\bar{Y}_i = \frac{\sum_{j=1}^{n_i} Y_{ij}}{n_i}.$$
The average $\bar{Y}_i$ is treated as a single response per subject. The analysis is then simplified, and linear regression and ANOVA techniques can easily be used.
Note: This approach is straightforward in balanced studies. A problem arises in unbalanced studies, where not all of the subjects have the same number of observations. So we could
• average all the available observations per subject and continue
• average all the available observations per subject and perform some weighted analysis
• ignore them or do something else???
2. Another way of analyzing LD data is by summarizing each profile by a slope.
• We treat the set of observations for each subject as a 'separate' population
• We regress $Y_{ij}$ against $t_{ij}$, with $n_i$ data points for subject $i$
• Define $Y^*_{ij} = Y_{ij} - \bar{Y}_i$ and $t^*_{ij} = t_{ij} - \bar{t}_i$, where $\bar{t}_i$ is the mean of the observation times for subject $i$. Then the slope can be written
$$\hat\beta_i = \frac{\sum_j t^*_{ij} Y^*_{ij}}{\sum_j t^*_{ij} t^*_{ij}}.$$
Then the $N$ slopes are treated as regular data and analyzed using standard techniques. For example, two-sample t-tests or ANOVA can be used to compare the slopes between two or more groups.
3. Many LD studies are designed to be analyzed as a paired analysis. Hence, if we have data of the form before treatment and after treatment, then the paired t-test could be the appropriate way of analysis.
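The first two summary measures, the per-subject average and the per-subject slope, can be computed with a few lines of plain Python. The sketch below applies the centred-sums formula for the slope to the first three mice of the small mice data; the function names `subject_mean` and `subject_slope` are illustrative, not part of the original analysis.

```python
def subject_mean(responses):
    """Average of all available responses for one subject."""
    return sum(responses) / len(responses)


def subject_slope(times, responses):
    """Least-squares slope of the responses on time, via centred sums."""
    t_bar = sum(times) / len(times)
    y_bar = sum(responses) / len(responses)
    numerator = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, responses))
    denominator = sum((t - t_bar) ** 2 for t in times)
    return numerator / denominator


days = [2, 5, 8, 11, 14, 17, 20]
mice = {
    22: [190, 388, 621, 823, 1078, 1132, 1191],
    23: [218, 393, 568, 729, 839, 852, 1004],
    24: [141, 260, 472, 662, 760, 885, 878],
}

# One (mean, slope) pair per mouse; these pairs become the "data" for the
# second-stage analysis (t-test, ANOVA, regression).
summaries = {
    mouse: (subject_mean(weights), subject_slope(days, weights))
    for mouse, weights in mice.items()
}
for mouse, (mean_weight, slope) in summaries.items():
    print(mouse, round(mean_weight, 1), round(slope, 1))
```

Each mouse is reduced to a single number per summary, so the correlation between repeated measurements no longer enters the second-stage analysis.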
Problems with simple analyses
1. Loss of efficiency: This occurs when we do NOT use all the data available to us.
• Omit subjects (NEVER do that)
• Omit observations
2. Bias: Can be introduced at many stages and in many different ways.
• by design
• by subjects who may drop out for reasons related to the study
• by the analyst through mis-analysis
3. Over-simplification: When we simplify the data, ignoring their richness.
Smoothing Techniques
In cases where the occasions of measurement are different, it is helpful to produce a "smoothed" plot of the mean response trend over time, as a summary measure.
• Many of these smoothing techniques estimate the mean response at any time by considering not only the observations at this particular occasion, but also the neighboring ones.
• That is, the estimated mean is based on observations taken before, at and after the time of interest.
• The mean, say, at time t is taken to be a weighted average of the observations in the neighborhood of time t.
A: Moving/Running Average
One of the most well-known and simplest approaches is the moving or running average.
• For longitudinal data that are balanced and complete, the moving average at time $t$, say $S_t$, is given by
$$S_t = \frac{1}{N}\sum_{i=1}^{N}\sum_{j=-k}^{k} w_j\, y_{i,t+j}, \qquad t = k+1, \ldots, n-k,$$
where
  – $k$ is some positive integer (e.g. $k = 1$ or $k = 2$) and
  – $\sum_{j=-k}^{k} w_j = 1$.
We refer to $2k+1$ as the order of the moving average. This expression assumes that $N$ individuals are measured at the same set of occasions.
• With unbalanced and/or incomplete data, a similar expression can be derived.
• The order of the moving average determines a symmetric neighborhood of values used to estimate $S_t$.
• The higher the order of the moving average, the greater the smoothness of the resulting estimate of the mean time trend.
• Hence, the lower the order of the moving average, the greater the roughness of the estimate.
• The $w_j$ are positive weights that add up to 1, usually equal. In the case where they are not equal, they are chosen to decrease symmetrically about some maximum value. That is, $w_j = w_{-j}$ and $w_0 > w_1 > \ldots > w_k$. As a result, observations closer to time $t$ have greater weight in the calculation of the mean than those further away.
• Based on this definition, the calculation of the moving average is problematic at the beginning and at the end of the time plot. A solution is to amend the summation to range from $j = \max(-k, 1-t)$ to $j = \min(k, n-t)$ and to divide by the corresponding sum of the included weights.
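For balanced and complete data the double sum in the moving average reduces to a weighted average of the occasion means, so a sketch only needs the mean response at each occasion. The Python function below is one possible reading of the definition, including the truncated summation at both ends of the time axis. The name `moving_average` and its arguments are illustrative assumptions, and the weights default to equal values.

```python
def moving_average(occasion_means, k=1, weights=None):
    """Moving average of order 2k+1 over the occasion means.

    weights lists w_{-k}, ..., w_k; near the ends the sum is truncated and
    divided by the total of the weights actually used.
    """
    n = len(occasion_means)
    if weights is None:
        weights = [1.0 / (2 * k + 1)] * (2 * k + 1)
    smoothed = []
    for t in range(1, n + 1):  # occasions are numbered 1..n, as in the formula
        lo, hi = max(-k, 1 - t), min(k, n - t)
        total = sum(weights[j + k] * occasion_means[t + j - 1] for j in range(lo, hi + 1))
        used = sum(weights[j + k] for j in range(lo, hi + 1))
        smoothed.append(total / used)
    return smoothed


# Mean weight per day for the first three mice of the small mice data.
mice = [
    [190, 388, 621, 823, 1078, 1132, 1191],
    [218, 393, 568, 729, 839, 852, 1004],
    [141, 260, 472, 662, 760, 885, 878],
]
day_means = [sum(column) / len(column) for column in zip(*mice)]
print([round(value, 1) for value in moving_average(day_means, k=1)])
```

A larger `k` widens the neighborhood and gives a smoother curve, at the price of more truncated sums at the two ends.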
[Figure: SmallMice. Weight plotted against day, with the mean per day and the moving average (k = 1).]
[Figure: CD4+ cell numbers plotted against years since seroconversion, with a smoothed mean curve (legend: Bandwidth).]
Similarly, but more efficiently, we can use the kernel smoother
$$\hat\mu(t) = \frac{\sum_{i=1}^{m} w(t, t_i, h)\, y_i}{\sum_{i=1}^{m} w(t, t_i, h)},$$
where $w$ is a weighting function that changes smoothly over time and gives more weight to observations close to time $t$.
A common weight function is the Gaussian (normal) kernel
$$K(u) = \exp(-0.5u^2).$$
Hence
$$w(t, t_i, h) = K\{(t - t_i)/h\},$$
where $h$ is the bandwidth of the kernel.
In R: bandwidth = "The kernels are scaled so that their quartiles (viewed as probability densities) are at +/- 0.25*bandwidth".
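The kernel smoother is a weighted average whose weights come from the kernel, which makes it easy to sketch in plain Python. The function names below are illustrative, and `h` is used exactly as in the formula, as the scale of the Gaussian kernel. Under the R convention quoted above, where the quartiles sit at +/- 0.25*bandwidth, a Gaussian kernel would have scale 0.25*bandwidth/0.6745, roughly 0.37*bandwidth; the helper `scale_from_quartile_bandwidth` encodes only that arithmetic. The data points are hypothetical.

```python
import math


def gaussian_kernel(u):
    """K(u) = exp(-0.5 u^2)."""
    return math.exp(-0.5 * u * u)


def kernel_smooth(t, times, responses, h):
    """Kernel-weighted average of the responses, evaluated at time t.

    If t is extremely far from every observation relative to h, all weights
    can underflow to zero and the division fails.
    """
    weights = [gaussian_kernel((t - t_i) / h) for t_i in times]
    return sum(w * y for w, y in zip(weights, responses)) / sum(weights)


def scale_from_quartile_bandwidth(bandwidth):
    """Gaussian scale whose quartiles lie at +/- 0.25 * bandwidth."""
    return 0.25 * bandwidth / 0.6745


# Hypothetical unbalanced data: irregular times and responses.
times = [-1.8, -0.9, -0.2, 0.4, 1.1, 2.3, 3.6]
responses = [1050, 980, 900, 760, 690, 610, 540]
grid = [-2 + 0.5 * step for step in range(13)]
curve = [kernel_smooth(t, times, responses, h=1.0) for t in grid]
print([round(value) for value in curve])
```

Increasing `h` flattens the curve towards the overall mean, while a small `h` follows individual observations closely.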
[Figure: Kernel Smoother (Box). CD4+ cell numbers plotted against years since seroconversion, with smoothed curves for bandwidth = 0.5 (default) and bandwidth = 4.]
[Figure: Kernel Smoother (Gaussian). CD4+ cell numbers plotted against years since seroconversion, with smoothed curves for bandwidth = 0.5 (default) and bandwidth = 3.]
B: Lowess
One popular method is (robust) LOcally WEighted (polynomial) regreSSion, or lowess.
• The lowess estimate at t is understood by imagining a 'window' centered at t.
• The lowess estimate of the mean at t is determined by fitting a straight line to the data inside the window and obtaining the predicted value at t from the fitted regression line (using the explanatory variable values for that data point).
• The polynomial is fitted by weighted least squares, giving more weight to points near the point whose response is being estimated and less weight to points further away.
• The entire lowess curve is obtained by moving the window of fixed width from left to right and repeating the process at every time point.
• The width of the window, called the bandwidth, determines the smoothness: the wider the window, the smoother the curve.
• The choice of bandwidth involves the classical trade-off between bias and precision. Excessive smoothing decreases the variance of the estimate at the risk of introducing bias. Insufficient smoothing is unlikely to introduce bias but will produce a highly variable estimate.
• Many of the details of this method, such as the degree of the polynomial model and the weights, are flexible.
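The effect of the window width can be illustrated with a short R sketch. The data are simulated purely for illustration (they are not the CD4 data shown in the figures), and the `roughness` helper is a hypothetical summary introduced here. Note that base R's `lowess()` controls the window through `f`, the fraction of points used in each local fit, rather than through a width on the time axis.

```r
# Simulated data for illustration only (not the CD4 data from the slides).
set.seed(1)
t <- seq(-2, 4, length.out = 200)
y <- 800 - 100 * t + 30 * t^2 + rnorm(200, sd = 150)

# lowess() is in base R (package stats); f is the smoother span,
# i.e. the proportion of points that influence the fit at each t.
fit_narrow <- lowess(t, y, f = 0.1)  # small window: wiggly curve
fit_wide   <- lowess(t, y, f = 0.8)  # large window: smooth curve

# Hypothetical roughness summary: sum of squared second differences
# of the fitted values (larger means a more wiggly curve).
roughness <- function(fit) sum(diff(fit$y, differences = 2)^2)

roughness(fit_narrow)
roughness(fit_wide)
```

Each fit is a list with components `x` and `y`, so the curves can be overlaid on a scatterplot with `lines(fit_narrow)` and `lines(fit_wide)`.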
References
• Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc. 74, 829–836.
• Cleveland, W. S. (1981). LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician 35, 54.
[Figure: SmallMice. Weight against Day, with the lowess curve and the mean per day overlaid.]
[Figure: lowess curve, macs (placebo). CD4 against Time.]
* (Robust Regression) *
Major problems in regression are the absence of
• normality (parametric)
• common variance
• independence of the errors
Other problems are
• overly influential data points
• outliers
• inadequate specification of the functional form of the model
• near-linear dependencies amongst the independent variables (collinearity)
• independent variables being subject to errors
Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods.
• A simple way to obtain parameter estimates that are less sensitive to outliers than the least squares estimates is to use least absolute deviations. Even then, gross outliers can still have a considerable impact on the model.
• Another approach to robust estimation of regression models is to replace the normal distribution with a heavy-tailed distribution. A t-distribution with between 4 and 6 degrees of freedom has been reported to be a good choice in various practical situations.
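Revision
i) Normal Distribution
• Univariate $N(\mu, \sigma^2)$. The probability density function is
$$\phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}.$$
• Multivariate $N_p(\mu, \Sigma)$. Let $x = (x_1, x_2, \ldots, x_p)'$ be a p-component random vector having a MVN distribution with mean $\mu = (\mu_1, \mu_2, \ldots, \mu_p)'$ and a $p \times p$ covariance matrix
$$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{pmatrix}.$$
The pdf has the form
$$f(x_1, x_2, \ldots, x_p) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left\{-0.5\,(x-\mu)'\Sigma^{-1}(x-\mu)\right\}.$$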
ii) Maximum Likelihood Estimation
• Independent Observations (simple linear regression)
Suppose the data are collected from a series of cross-sectional studies. We have a sample of N individuals at n occasions, and the data are of the form $(Y_{ij}, X_{ij})$ for the ith individual at the jth occasion. The model has the form
$$Y_{ij} = X'_{ij}\beta + e_{ij},$$
where $e_{ij} \sim N(0, \sigma^2)$. Hence, with $\mu_{ij} = X'_{ij}\beta$,
$$f(y_{ij}) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(y_{ij}-\mu_{ij})^2}{2\sigma^2}\right\},$$
and the likelihood function takes the form
$$L = \prod_{i=1}^{N} \prod_{j=1}^{n} f(y_{ij}).$$
The log-likelihood then becomes
$$l = \log \prod_{i=1}^{N} \prod_{j=1}^{n} f(y_{ij}) = -\frac{nN}{2}\log(2\pi\sigma^2) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{n} \frac{(y_{ij} - X'_{ij}\beta)^2}{\sigma^2},$$
while the MLE of $\beta$ (also the OLS estimate) is
$$\hat\beta = \left\{\sum_{i=1}^{N} \sum_{j=1}^{n} X_{ij}X'_{ij}\right\}^{-1} \sum_{i=1}^{N} \sum_{j=1}^{n} X_{ij}y_{ij}.$$
Note: In this process we have ignored $\sigma^2$.
• Correlated Observations
In this case we have $n_i$ observations for the ith subject. Assume $\Sigma_i$ is known, so we do not need to estimate it (later we see how it can be estimated). It is assumed that $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$ has a $N_{n_i}(\mu_i, \Sigma_i)$ distribution. Hence, the log-likelihood can be written as
$$l = -\frac{K}{2}\log(2\pi) - \frac{1}{2} \sum_{i=1}^{N} \log|\Sigma_i| - \frac{1}{2} \sum_{i=1}^{N} (y_i - X_i\beta)'\Sigma_i^{-1}(y_i - X_i\beta),$$
where $K = \sum_{i=1}^{N} n_i$ is the total number of observations.
Then the estimator of $\beta$, known as the GLS estimator, can be expressed as
$$\hat\beta = \left\{\sum_{i=1}^{N} X'_i\Sigma_i^{-1}X_i\right\}^{-1} \sum_{i=1}^{N} X'_i\Sigma_i^{-1}y_i,$$
and has the properties:
– It is unbiased:
$$E(\hat\beta) = \beta.$$
– Asymptotically it has a MVN distribution with mean $\beta$ and
$$\mathrm{Cov}(\hat\beta) = \left\{\sum_{i=1}^{N} X'_i\Sigma_i^{-1}X_i\right\}^{-1}.$$
Note: Similar asymptotic properties hold when $\Sigma_i$ is estimated. However, with small sample sizes, the sampling distribution of $\hat\beta$ is adversely influenced by the number of covariance parameters that need to be estimated.
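The GLS formula can be evaluated directly by accumulating the two sums over subjects. The following R sketch does this for simulated balanced data; the design matrix, the "true" coefficients and the compound-symmetry covariance are all assumptions chosen for illustration, with $\Sigma_i$ treated as known and equal for every subject.

```r
# Simulated balanced data: N subjects, n occasions, known covariance.
set.seed(42)
N <- 50
n <- 3
X <- cbind(1, 0:(n - 1))                      # same design for every subject
beta_true <- c(10, -2)                        # hypothetical true coefficients
Sigma <- 4 * (matrix(0.6, n, n) + diag(0.4, n))  # assumed known covariance
Sigma_inv <- solve(Sigma)
C <- t(chol(Sigma))                           # Sigma = C %*% t(C)

# Column i of Y holds the response vector y_i of subject i.
Y <- as.vector(X %*% beta_true) + C %*% matrix(rnorm(n * N), n, N)

# Accumulate sum_i X_i' Sigma_i^{-1} X_i and sum_i X_i' Sigma_i^{-1} y_i.
XtSX <- matrix(0, ncol(X), ncol(X))
XtSy <- matrix(0, ncol(X), 1)
for (i in seq_len(N)) {
  XtSX <- XtSX + t(X) %*% Sigma_inv %*% X
  XtSy <- XtSy + t(X) %*% Sigma_inv %*% Y[, i]
}

beta_gls <- solve(XtSX, XtSy)   # GLS estimate of beta
cov_beta <- solve(XtSX)         # covariance matrix of the GLS estimate
beta_gls
sqrt(diag(cov_beta))            # standard errors
```

Because every subject shares the same design matrix here, the same estimate is obtained by applying the formula once to the vector of occasion means; with subject-specific $X_i$ and $\Sigma_i$ only the loop body would change.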
Modelling the Mean: Profile Analysis
• Initially, we impose no structure on the mean response over time.
• Additionally, we impose no structure on the covariance among the repeated measures. This will be dealt with in detail later.
• In order to perform a Profile Analysis, we require balanced data, with the timing of the repeated measures common to all individuals in the study.
• Unbalanced designs due to missing data can be handled.
• This kind of analysis is appealing when there is a single categorical covariate (e.g. treatment group) and when a specific pattern for the differences in the response profiles cannot be specified.
[Figure: Mean blood lead level against time (weeks) for the Succimer and Placebo groups.]
Hypotheses:
For simplicity, assume that we have a two-level categorical covariate (two-group design). Any generalization should be straightforward.
Hence, the following questions arise:
• Are the profiles of the groups parallel? In other words, is there a group × time interaction?
• Is there a time effect? (under the assumption that the mean response profiles are parallel)
• Is there a group effect? (under the assumption that the mean response profiles are parallel)
[Figure: three panels, each showing Mean Response against Time.]
Suppose we have the two-group design, where a new treatment (T) is compared to a standard one (C).
Group        Occasion 1    Occasion 2    ...    Occasion n
Treatment    $\mu_1(T)$    $\mu_2(T)$    ...    $\mu_n(T)$
Control      $\mu_1(C)$    $\mu_2(C)$    ...    $\mu_n(C)$
Difference   $\Delta_1$    $\Delta_2$    ...    $\Delta_n$
where $\Delta_j = \mu_j(T) - \mu_j(C)$.
The null hypothesis is that there is no treatment × time interaction. This means that the difference in the means between the treatment groups is the same over time. Hence:
$$H_0 : \Delta_1 = \Delta_2 = \ldots = \Delta_n.$$
This provides a test with $(n-1)$ degrees of freedom.
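In a General Linear Model formulation we have
$$E(Y_i|X_i) = \mu_i = X_i\beta,$$
where $X_i$ is an appropriate design matrix for the kind of interpretation we want for the $\beta$'s.
Example: If we have n = 3 measurements from two groups, then we require 2 × 3 = 6 parameters for the means. For group A we have
$$\mu_1(A) = \beta_1, \quad \mu_2(A) = \beta_2, \quad \mu_3(A) = \beta_3,$$
while for group B we have
$$\mu_1(B) = \beta_4, \quad \mu_2(B) = \beta_5, \quad \mu_3(B) = \beta_6.$$
Hence, the design matrix for group A has the form
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix},$$
while for group B
$$X_i = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix},$$
where $\beta = (\beta_1, \beta_2, \ldots, \beta_6)'$ is a 6 × 1 vector of regression coefficients. Hence:
$$\mu(A) = \begin{pmatrix} \mu_1(A) \\ \mu_2(A) \\ \mu_3(A) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} \quad \text{and} \quad \mu(B) = \begin{pmatrix} \mu_1(B) \\ \mu_2(B) \\ \mu_3(B) \end{pmatrix} = \begin{pmatrix} \beta_4 \\ \beta_5 \\ \beta_6 \end{pmatrix}.$$
With this parameterization we cannot test $H_0$ by simply setting one of the $\beta$'s equal to zero. The null hypothesis of no treatment × time interaction can be re-expressed as
$$H_0 : (\beta_1 - \beta_4) = (\beta_2 - \beta_5) = (\beta_3 - \beta_6),$$
and written in matrix form as $H_0 : L\beta = 0$, for
$$L = \begin{pmatrix} 1 & -1 & 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & -1 & 0 & 1 \end{pmatrix}.$$
This expression leads to the following set of equations
$$\begin{cases} \beta_1 - \beta_2 - \beta_4 + \beta_5 = 0 \\ \beta_1 - \beta_3 - \beta_4 + \beta_6 = 0 \end{cases} \Rightarrow \begin{cases} \beta_1 - \beta_4 = \beta_2 - \beta_5 \\ \beta_1 - \beta_4 = \beta_3 - \beta_6 \end{cases}$$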
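As a quick check of the contrast matrix, the R sketch below builds the two design matrices and $L$ from this example and applies them to two made-up coefficient vectors (`beta_parallel` and `beta_nonpar` are hypothetical values, not estimates from any data set).

```r
# Design matrices for the cell-means parameterization (n = 3, two groups).
X_A <- cbind(diag(3), matrix(0, 3, 3))
X_B <- cbind(matrix(0, 3, 3), diag(3))

# Contrast matrix for H0: no group x time interaction.
L <- rbind(c(1, -1,  0, -1, 1, 0),
           c(1,  0, -1, -1, 0, 1))

# Hypothetical coefficient vectors, chosen for illustration.
beta_parallel <- c(10, 12, 15, 7, 9, 12)  # group difference is 3 at every occasion
beta_nonpar   <- c(10, 12, 15, 7, 9, 14)  # difference changes at occasion 3

X_A %*% beta_parallel   # means of group A
X_B %*% beta_parallel   # means of group B
L %*% beta_parallel     # both contrasts are zero: parallel profiles
L %*% beta_nonpar       # second contrast is non-zero: profiles not parallel
```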
A slightly different parameterization uses one group as a reference group (this is the default parameterization in many statistical packages). In this approach the design matrices have the form
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}$$
for group A, while for group B
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 \end{pmatrix},$$
where in this case the reference group is the first one (group A).
Hence:
$$\mu(A) = \begin{pmatrix} \mu_1(A) \\ \mu_2(A) \\ \mu_3(A) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_1 + \beta_2 \\ \beta_1 + \beta_3 \end{pmatrix}$$
and
$$\mu(B) = \begin{pmatrix} \mu_1(B) \\ \mu_2(B) \\ \mu_3(B) \end{pmatrix} = \begin{pmatrix} \beta_1 + \beta_4 \\ (\beta_1 + \beta_4) + (\beta_2 + \beta_5) \\ (\beta_1 + \beta_4) + (\beta_3 + \beta_6) \end{pmatrix}.$$
As a result, the null hypothesis of no treatment × time interaction now takes the form
$$H_0 : \beta_5 = \beta_6 = 0,$$
which is a simpler and more straightforward way of testing $H_0$.
Additionally, testing for the main effects (group and time) is straightforward. Under the assumption of no interaction effect, the hypothesis of no time effect can be assessed through
$$H'_0 : \beta_2 = \beta_3 = 0,$$
while the group effect can be assessed through
$$H''_0 : \beta_4 = 0.$$
General Case
In a similar way we can parameterize the case where G groups are compared over n occasions. In the 'first' parameterization (no reference group) we can introduce G dummy (binary) variables, indicators for each one of the G treatment groups:
$$Z_{ig} = \begin{cases} 1, & \text{if the ith subject belongs to group } g; \\ 0, & \text{otherwise.} \end{cases}$$
Hence:
$$\begin{pmatrix} \mu_i(1) \\ \mu_i(2) \\ \vdots \\ \mu_i(G-1) \\ \mu_i(G) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{G-1} \\ \beta_G \end{pmatrix}.$$
If, however, we choose to introduce an intercept, say $\beta_1$, then we need $G - 1$ dummy variables. Hence, if we let group G be the reference group, we get
$$\begin{pmatrix} \mu_i(1) \\ \mu_i(2) \\ \vdots \\ \mu_i(G-1) \\ \mu_i(G) \end{pmatrix} = \begin{pmatrix} \beta_1 + \beta_2 \\ \beta_1 + \beta_3 \\ \vdots \\ \beta_1 + \beta_G \\ \beta_1 \end{pmatrix}.$$
How It Is Done
• In Profile Analysis we basically have two covariates, both factors. One, say $Z_1$, represents treatment group ($G \ge 2$), while the second, say $Z_2$, represents the n occasions at which we have measurements.
• Assume we have G = 2 treatment groups and n = 3 occasions. The 'usual' approach (most statistical software) is to introduce an intercept into the model by default. In this case we need G − 1 = 1 dummy variable for treatment and n − 1 = 2 for occasions.
– For treatment we have (assuming standard treatment is the reference)
$$Z_{i1} = \begin{cases} 1, & \text{if the ith subject is on the new treatment;} \\ 0, & \text{if the ith subject is on the standard treatment.} \end{cases}$$
– For the occasions (assuming the first one is the reference) we have
$$Z_{i21} = \begin{cases} 1, & \text{for an observation at the second occasion;} \\ 0, & \text{otherwise,} \end{cases}$$
and
$$Z_{i22} = \begin{cases} 1, & \text{for an observation at the third occasion;} \\ 0, & \text{otherwise.} \end{cases}$$
• The model takes the form
– with no treatment × time interaction
$$\mu_i = \beta_1 + \beta_2 Z_{i1} + \beta_3 Z_{i21} + \beta_4 Z_{i22}$$
– with interaction
$$\mu_i = \beta_1 + \beta_2 Z_{i1} + \beta_3 Z_{i21} + \beta_4 Z_{i22} + \beta_5 Z_{i1}Z_{i21} + \beta_6 Z_{i1}Z_{i22}$$
• For example (model with interaction):
– If the ith subject is on the new treatment, at the second occasion we have
$$\mu_i = \beta_1 + \beta_2 + \beta_3 + \beta_5.$$
– If the ith subject is on the standard treatment, at the third occasion we have
$$\mu_i = \beta_1 + \beta_4.$$
– If the ith subject is on the standard treatment, at the first occasion we have
$$\mu_i = \beta_1.$$
• The design matrices based on our model are
– for patients on the new treatment
$$X_i = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 \end{pmatrix},$$
– for patients on the standard treatment
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 \end{pmatrix}.$$
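The dummy-variable coding above is what R's `model.matrix()` produces under its default treatment contrasts. The sketch below reproduces the two design matrices; the data frame `d` and the factor names `trt` and `occasion` are hypothetical names introduced for this illustration.

```r
# One row per (treatment, occasion) combination; the first level of each
# factor ("standard", occasion 1) acts as the reference level.
d <- expand.grid(
  occasion = factor(1:3),
  trt = factor(c("standard", "new"), levels = c("standard", "new"))
)

# Columns: intercept, treatment dummy, two occasion dummies, two interactions,
# in the same order as beta_1, ..., beta_6 in the model with interaction.
X <- model.matrix(~ trt * occasion, data = d)
colnames(X)

X_new <- X[d$trt == "new", ]       # design matrix, new treatment
X_std <- X[d$trt == "standard", ]  # design matrix, standard treatment
X_new
X_std
```

Dropping a row, for instance `X_new[-3, ]`, gives the design matrix of a subject who missed the third occasion.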
Missing Data
We have mentioned that this type of analysis requires balanced structures. However, missing data can easily be dealt with by constructing the appropriate design matrix.
For example, if a subject attends two of the arranged occasions and misses the third one, then we can simply remove the corresponding row from the design matrix.
Hence, if a patient from group A (in the first parameterization, without a reference group) misses the third visit, then the design matrix becomes
$$X_i = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}$$
Tools & Concepts
Now we consider how to make inferences about $\beta$, specifically confidence intervals and hypothesis tests.
A. Statistical Inference:
To estimate $\beta$ we use ML to obtain $\hat\beta$, with estimated covariance matrix
$$\widehat{\mathrm{Cov}}(\hat\beta) = \left\{\sum_{i=1}^{N} X'_i\hat\Sigma_i^{-1}X_i\right\}^{-1},$$
where $\hat\Sigma_i$, the ML estimate of $\Sigma_i$, is used.
1. Confidence Intervals: For every single component $\beta_k$ of $\beta$ we have
$$\hat\beta_k \pm 1.96\sqrt{\widehat{\mathrm{Var}}(\hat\beta_k)}$$
for a 95% confidence interval.
Generally, if $L$ is a vector or matrix of known weights, then
$$L\hat\beta \pm 1.96\sqrt{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'}$$
2. Wald Test: Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of the parameter based on the sample estimate. Hence, for testing the hypothesis
$$H_0 : \beta_k = 0 \quad \text{against} \quad H_A : \beta_k \ne 0,$$
we calculate the Wald statistic
$$Z = \frac{\hat\beta_k}{\sqrt{\widehat{\mathrm{Var}}(\hat\beta_k)}},$$
which can be compared with $N(0, 1)$.
In general, confidence intervals and tests can be constructed for linear combinations of the components of $\beta$. Assume that $L\beta$ represents a set of contrasts of interest. The hypothesis test takes the form
$$H_0 : L\beta = 0 \quad \text{against} \quad H_A : L\beta \ne 0,$$
and the Wald statistic becomes
$$Z' = \frac{L\hat\beta}{\sqrt{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'}}.$$
If $L$ is a single row vector, then $L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'$ is a scalar and we compare $Z'$ to the standard normal distribution.
Furthermore, since $Z' \sim N(0, 1)$, $Z'^2$ has a $\chi^2$ distribution with 1 degree of freedom (df). As a result, an identical test of the above hypothesis uses the statistic
$$W^2 = (L\hat\beta)'\left\{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'\right\}^{-1}(L\hat\beta),$$
and compares $W^2$ to $\chi^2_1$.
Moreover, this formulation generalizes to the case where $L$ has more than one row, allowing the simultaneous testing of a multivariate hypothesis. If $L$ has r rows, then a simultaneous test of
$$H_0 : L\beta = 0 \quad \text{against} \quad H_A : L\beta \ne 0$$
is given by
$$W^2 = (L\hat\beta)'\left\{L\,\widehat{\mathrm{Cov}}(\hat\beta)\,L'\right\}^{-1}(L\hat\beta),$$
which follows a $\chi^2$ distribution with r df.
This is often referred to as the multivariate Wald test.
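The Wald statistic $W^2$ is straightforward to compute once $\hat\beta$ and its estimated covariance matrix are available. The R sketch below uses made-up values for both (they are not output from any model in these slides), and `wald` is a hypothetical helper written for this illustration.

```r
# Hypothetical estimates and covariance matrix, for illustration only.
beta_hat <- c(2.0, -1.5, 0.8)
cov_beta <- matrix(c(0.25, 0.05, 0.00,
                     0.05, 0.16, 0.02,
                     0.00, 0.02, 0.09), nrow = 3, byrow = TRUE)

# Wald test of H0: L beta = 0; L is a row vector or a matrix with r rows.
wald <- function(L, beta_hat, cov_beta) {
  L <- matrix(L, ncol = length(beta_hat))
  est <- L %*% beta_hat
  V <- L %*% cov_beta %*% t(L)
  W2 <- as.numeric(t(est) %*% solve(V) %*% est)
  df <- nrow(L)
  list(W2 = W2, df = df, p = pchisq(W2, df, lower.tail = FALSE))
}

# Single contrast: beta_2 = 0. Here W2 equals Z^2 = (-1.5 / 0.4)^2.
w1 <- wald(c(0, 1, 0), beta_hat, cov_beta)

# Simultaneous test of beta_2 = beta_3 = 0 (r = 2 rows, so 2 df).
w2 <- wald(rbind(c(0, 1, 0), c(0, 0, 1)), beta_hat, cov_beta)
w1
w2
```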
3. Likelihood Ratio Test:
• The LRT can be used to compare two models when one model is a special case of (nested in) the other.
• The alternative or full model allows some parameters to vary, whereas the null or reduced model fixes those parameters at known values.
• The LRT statistic is 2 times the difference of the maximized log-likelihoods of the two models. The alternative or full model (larger model) always has a log-likelihood ($l_{full}$) at least as large as that of the null or reduced model, $l_{red} \le l_{full}$. Hence, the test statistic
$$G^2 = 2(l_{full} - l_{red})$$
is constructed to measure how much larger $l_{full}$ is than $l_{red}$. The larger $G^2$ is, the stronger the evidence that the smaller model (null) is inadequate.
• We compare $G^2$ to a $\chi^2$ distribution with df equal to the difference between the numbers of parameters in the two models.
Note 1: Likelihood-based confidence intervals can be constructed using the profile likelihood. More specifically, for a single component $\beta_k$ of $\beta$, the profile log-likelihood $l_p(\beta_k)$ is obtained by maximizing the log-likelihood over the remaining parameters while keeping $\beta_k$ fixed. A 95% CI is then constructed from the values of $\beta_k$ that satisfy
$$2\{l_p(\hat\beta_k) - l_p(\beta_k)\} \le \text{critical value},$$
where the critical value is 3.84, the 95th percentile of $\chi^2_1$.
Note 2: The LRT can be used for covariance parameters. Due to problems with the sampling distribution of variance parameters, the Wald test is not recommended. Even with the LRT there are some problems in comparing nested models for covariance parameters.
B. Restricted (residual) Maximum Likelihood (REML) Estimation:
• Introduced by Patterson & Thompson (1971) as a way of estimating variance components in a GLM.
• In ML estimation the log-likelihood function has the form
$$l = -\frac{K}{2}\log(2\pi) - \frac{1}{2} \sum_{i=1}^{N} \log|\Sigma_i| - \frac{1}{2} \sum_{i=1}^{N} (y_i - X_i\beta)'\Sigma_i^{-1}(y_i - X_i\beta),$$
where $K = \sum_{i=1}^{N} n_i$ is the total number of observations.
• It is known that the ML estimate of $\Sigma_i$ is biased in small samples.
• To illustrate, consider the case where observations are independent (from cross-sectional studies) with constant variance $\sigma^2$. Estimates of both $\beta$ and $\sigma^2$ come from the maximization of the log-likelihood function
$$l = -\frac{K}{2}\log(2\pi\sigma^2) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{n} \frac{(y_{ij} - X'_{ij}\beta)^2}{\sigma^2}$$
• The MLE of $\sigma^2$ is
$$\hat\sigma^2 = \frac{\sum_{i=1}^{N} \sum_{j=1}^{n} (y_{ij} - X'_{ij}\hat\beta)^2}{K},$$
and we know that $\hat\sigma^2$ is a biased estimate of $\sigma^2$:
$$E(\hat\sigma^2) = \left(\frac{K - p}{K}\right)\sigma^2,$$
where p is the dimension of $\beta$.
• An unbiased estimate of $\sigma^2$ is
$$\tilde\sigma^2 = \frac{\sum_{i=1}^{N} \sum_{j=1}^{n} (y_{ij} - X'_{ij}\hat\beta)^2}{K - p},$$
which is known as the REML estimate.
• In effect, the bias arises because the ML estimate does not take into account the fact that $\beta$ is also being estimated from the same data.
As a result, Restricted (residual) Maximum Likelihood Estimation was developed to address this particular problem.
• The main idea is to separate out the part of the data that is used for the estimation of the variance parameters.
• Hence, we need to eliminate $\beta$ from the likelihood, so that only $\Sigma_i$ is left in the likelihood to be estimated.
• One way of doing that is by transforming the data to a set of linear combinations of observations whose distribution does not depend on $\beta$.
• In the case of a GLM with dependent errors, the REML estimator is defined as the MLE based on a linearly transformed set of data
$$Y^* = AY,$$
such that the distribution of $Y^*$ does not depend on $\beta$.
• For example, the residuals after estimating $\beta$ by OLS can be used to estimate $\Sigma_i$. Hence,
$$A = I - X(X'X)^{-1}X'.$$
• Then $Y^*$ has a singular multivariate Gaussian distribution with mean zero, whatever the value of $\beta$.
• The REML estimator of $\Sigma_i$ is less biased than the ML estimator. When N is much larger than p the difference becomes less important.
• The REML estimator is used for $\Sigma_i$, while $\beta$ is estimated by the GLS estimator
$$\hat\beta = \left\{\sum_{i=1}^{N} X'_i\hat\Sigma_i^{-1}X_i\right\}^{-1} \sum_{i=1}^{N} X'_i\hat\Sigma_i^{-1}y_i,$$
obtained by plugging in the REML estimate $\hat\Sigma_i$ of $\Sigma_i$.
• REML is the default in R (e.g. in lmer) and in many statistical packages.
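Both ideas in this section can be checked numerically in the independent-errors case. The R sketch below, on simulated data chosen only for illustration, compares the ML and REML variance estimates and verifies that the residual-forming matrix $A$ removes the mean, so that $AY$ carries no information about $\beta$.

```r
# Simulated cross-sectional data (illustrative values only).
set.seed(7)
K <- 30
x <- rnorm(K)
y <- 1 + 2 * x + rnorm(K, sd = 3)
X <- cbind(1, x)
p <- ncol(X)

fit <- lm(y ~ x)
rss <- sum(residuals(fit)^2)
sigma2_ml   <- rss / K        # ML estimate: divides by K, biased downwards
sigma2_reml <- rss / (K - p)  # REML estimate: divides by K - p, unbiased

# Residual-forming matrix A = I - X (X'X)^{-1} X'.
A <- diag(K) - X %*% solve(t(X) %*% X) %*% t(X)
max(abs(A %*% X))   # numerically zero, so E(AY) = A X beta = 0 for any beta
sum(diag(A))        # trace is K - p: AY has a singular distribution

c(ML = sigma2_ml, REML = sigma2_reml)
```

The REML value coincides with the usual residual variance reported by `summary(fit)$sigma^2`.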
Model Selection
Model selection involves the choice of an appropriate model among a set of candidate models.
1. Nested Models: The likelihood ratio (LR) test is used for nested models, that is, when the reduced model is a special case of the full model. In this case the LR test can be seen as a model selection tool, since it lets us decide whether the additional complexity of the full model is worthwhile or the simpler model describes the data equally well.
2. Generally: Model selection techniques are useful for screening through many different covariance models. The goal is to choose the 'best' model for use in further analysis. The (log-)likelihood is once again the driving force behind any selection tool. More specifically:
• Criterion-based approaches compare log-likelihoods penalized for the number of parameters in the model.
• The penalty increases with the number of parameters. This is because models with many parameters should fit better (higher log-likelihood) than models with fewer parameters.
• The penalty is used to level off this discrepancy.
• The most popular selection criteria are
– The Akaike Information Criterion (AIC). For a given model m the AIC is defined as
$$AIC(m) = -2\,\mathrm{loglikelihood}(m) + 2q_m,$$
where $q_m$ is the number of parameters in the model.
– The Bayes Information Criterion (BIC), defined as
$$BIC(m) = -2\,\mathrm{loglikelihood}(m) + \log(N)\,q_m,$$
where N is the number of observations (sample size).
• For covariance models the REML log-likelihood is used.
• Model selection proceeds similarly with both criteria.
– We fit the models of interest to the data and rank them according to either their AIC or BIC value.
– The model with the smallest value is selected as best.
Example: TLC Data
• Reshape the data (wide to long):
# times = c(0, 1, 4, 6) labels the occasions so that they match the factor(time) levels in the output below
tlc.long = reshape(tlc, idvar = "id", varying = c("lead0", "lead1", "lead4", "lead6"), v.names = "lead", times = c(0, 1, 4, 6), direction = "long")
• Model 1:
library(lme4)  # provides lmer()
fm1 = lmer(lead ~ factor(time) + factor(group) + (1 | id), data = tlc.long)
• Model 2:
fm2 = lmer(lead ~ factor(time) + (1 | id), data = tlc.long)
> fm1
Linear mixed-effects model fit by REML
Formula: lead ~ factor(time) + factor(group) + (1 | id)
   Data: tlc.long
  AIC  BIC logLik MLdeviance REMLdeviance
 2576 2600  -1282       2569         2564
Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 24.475   4.9472
 Residual             24.417   4.9414
number of obs: 400, groups: id, 100

Fixed effects:
               Estimate Std. Error t value
(Intercept)     23.6173     0.8915  26.493
factor(time)1   -7.3150     0.6988 -10.468
factor(time)4   -6.6140     0.6988  -9.465
factor(time)6   -4.2020     0.6988  -6.013
factor(group)P   5.5775     1.1060   5.043

Correlation of Fixed Effects:
            (Intr) fct()1 fct()4 fct()6
factor(tm)1 -0.392
factor(tm)4 -0.392  0.500
factor(tm)6 -0.392  0.500  0.500
factr(grp)P -0.620  0.000  0.000  0.000

> fm2
Linear mixed-effects model fit by REML
Formula: lead ~ factor(time) + (1 | id)
   Data: tlc.long
  AIC  BIC logLik MLdeviance REMLdeviance
 2599 2619  -1294       2592         2589
Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 32.022   5.6588
 Residual             24.417   4.9414
number of obs: 400, groups: id, 100

Fixed effects:
              Estimate Std. Error t value
(Intercept)    26.4060     0.7513   35.15
factor(time)1  -7.3150     0.6988  -10.47
factor(time)4  -6.6140     0.6988   -9.46
factor(time)6  -4.2020     0.6988   -6.01

Correlation of Fixed Effects:
            (Intr) fct()1 fct()4
factor(tm)1 -0.465
factor(tm)4 -0.465  0.500
factor(tm)6 -0.465  0.500  0.500

> anova(fm1,fm2)
Data: tlc.long
Models:
fm2: lead ~ factor(time) + (1 | id)
fm1: lead ~ factor(time) + factor(group) + (1 | id)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm2  5 2602.4 2622.4 -1296.2
fm1  6 2581.4 2605.3 -1284.7 23.069      1  1.563e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
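The quantities in the anova table can be reproduced by hand from the reported log-likelihoods, which makes the definitions of $G^2$, AIC and BIC concrete. The R sketch below uses only the rounded numbers printed in the table (log-likelihoods, parameter counts, and 400 observations), so small rounding differences from the printed Chisq value are to be expected. The log-likelihoods in the table agree with the ML deviances in the model printouts, not with the REML ones.

```r
# Values copied from the anova() table above (rounded as printed).
loglik <- c(fm2 = -1296.2, fm1 = -1284.7)
q      <- c(fm2 = 5, fm1 = 6)   # number of parameters (Df column)
n_obs  <- 400                   # number of observations

# Likelihood ratio test of fm2 (reduced) against fm1 (full).
G2      <- unname(2 * (loglik["fm1"] - loglik["fm2"]))
df      <- unname(q["fm1"] - q["fm2"])
p_value <- pchisq(G2, df = df, lower.tail = FALSE)

# Information criteria: smaller is better.
aic <- -2 * loglik + 2 * q
bic <- -2 * loglik + log(n_obs) * q

G2
p_value
rbind(AIC = aic, BIC = bic)
names(which.min(aic))   # model preferred by AIC
```

Both criteria and the LRT point to the same conclusion here: the model including the group effect (`fm1`) is preferred.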
Modelling the mean: Parametric Curves
• As the number of occasions increases and the number of irregularly timed observations increases, Profile Analysis becomes less and less appealing.
• Furthermore, it is reasonable in many circumstances to expect that the mean response changes smoothly (monotonically) over time, at least for the duration of the study.
• Fitting parsimonious models for the mean response leads to statistical tests with greater power than Profile Analysis (narrower range of alternative hypotheses).
• This, however, is true only if the assumed structure for the mean is 'correct'.
Linear Trends over time
Assume the model
$$E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij} + \beta_3 \mathrm{Group}_i + \beta_4 \mathrm{Time}_{ij} \times \mathrm{Group}_i,$$
• $\mathrm{Group}_i = 1$ for the new treatment and $0$ otherwise.
• $\mathrm{Time}_{ij}$ has two indices to allow for mistimed observations.
• Hence:
– for the control group we have: $E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij}$.
– for the experimental treatment group we have: $E(Y_{ij}) = (\beta_1 + \beta_3) + (\beta_2 + \beta_4)\mathrm{Time}_{ij}$.
[Figure: Mean Response against Time (0 to 10), with linear trends for the Control and Treatment groups.]
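Quadratic Trends over time
Assume the model
$$E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij} + \beta_3 \mathrm{Time}^2_{ij} + \beta_4 \mathrm{Group}_i + \beta_5 \mathrm{Time}_{ij} \times \mathrm{Group}_i + \beta_6 \mathrm{Time}^2_{ij} \times \mathrm{Group}_i.$$
• Changes in the mean response are no longer constant. The rate of change now depends on time (earlier/later).
• Hence:
– for the control group we have:
$$E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij} + \beta_3 \mathrm{Time}^2_{ij}.$$
– for the experimental treatment group we have:
$$E(Y_{ij}) = (\beta_1 + \beta_4) + (\beta_2 + \beta_5)\mathrm{Time}_{ij} + (\beta_3 + \beta_6)\mathrm{Time}^2_{ij}.$$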
[Figure: Mean Response against Time (0 to 10), with quadratic trends for the Control and Treatment groups.]
• There is a natural hierarchy in higher order models. In the quadratic model
$$E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij} + \beta_3 \mathrm{Time}^2_{ij},$$
we first test the quadratic trend ($\beta_3 = 0$) before we move on to the linear term ($\beta_2 = 0$).
• It is very important to consider how variables enter the model. Centering variables at their mean value gives a simple interpretation to the intercept. Additionally, collinearity problems are avoided. For example, consider $\mathrm{Time}_j$. If $\mathrm{Time}_j \in \{0, 1, 2, \ldots, 10\}$, then the correlation between $\mathrm{Time}_j$ and $\mathrm{Time}^2_j$ is 0.96. However, if we center Time by subtracting its mean value 5, then the correlation goes down to zero.
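The correlations quoted above are easy to confirm with a few lines of R (the variable names are chosen here for illustration).

```r
time <- 0:10
cor(time, time^2)            # about 0.96: linear and quadratic terms nearly collinear

time_c <- time - mean(time)  # centered at the mean value 5
cor(time_c, time_c^2)        # zero: centered terms are uncorrelated
```

The second correlation vanishes because the centered times are symmetric about zero, so the products of `time_c` and `time_c^2` cancel in pairs.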
Time-Varying Covariates
• So far we have discussed cases where covariates remain unchanged over time.
• A common case is one where at the first visit all subjects are in the same state (say untreated) and any intervention is given from the second visit onwards (TLC data).
• Furthermore, in many trials patients tend to switch treatments for various reasons, usually side effects or even personal choice (when the treatment cannot be disclosed). As a result, we need to allow for this change over the duration of the study (cross-over trials).
• In the case of a time-varying treatment indicator, the treatment variable, say $G_i$, will not be constant over time. The vector
$$G_i = (0, 0, 1, 1, 1)'$$
indicates that patient i was on placebo for the first two occasions and then switched to active treatment. Extensions to more than two treatment groups are possible with the inclusion of the right number of indicator (dummy) variables.
• In exactly the same way we model a continuous time-varying covariate: at each occasion the current value of the covariate is included in the model.
Other Approaches: Splines
There are cases where longitudinal trends in the mean response cannot be characterized by first and second degree polynomials in time. Additionally, there are cases where non-linear trends cannot be well approximated by polynomials in time of any order. This can happen when the mean response increases or decreases rapidly for some duration and then continues more slowly. A class of models called splines is then used to describe these complicated curves.
A. Step Function:
• The simplest spline model for the population mean is a sequence of flat steps.
• In this approach, each step approximates the mean response over a small interval of time. The result is a step function that approximates the smooth curve of the mean response.
• The step function parameterization is quite straightforward. Suppose that we have observations at 9 time points (occasions) from t = 1 to t = 9 and a step function with three steps and breaks at 2.5 and 5.5. Then
time    V1       V2
1       1 0 0    1 0 0
2       1 0 0    1 0 0
3       0 1 0    1 1 0
4       0 1 0    1 1 0
5       0 1 0    1 1 0
6       0 0 1    1 1 1
7       0 0 1    1 1 1
8       0 0 1    1 1 1
9       0 0 1    1 1 1
• In the V1 parameterization the parameters represent the mean response in the intervals (1, 2.5), (2.5, 5.5) and (5.5, 9). In the V2 parameterization, $\beta_1$ represents the mean response in the first interval (1, 2.5), $\beta_2$ represents the difference in the mean response between the second and first intervals, and $\beta_3$ the difference between the means in the third and second intervals.
Note A: This parameterization is similar to the one for the unstructured mean, the only difference being that a single parameter is the mean for multiple time points.
B. Bent Line (piece-wise linear):
• Another approach, slightly more complicated, is to assume continuous functions, linear on intervals of time, with the slope allowed to change from one interval to the next.
• As a result, connected line segments approximate the continuous curve of the mean response.
• The bent line requires two parameters for the first interval and one additional parameter, for the change in slope, for every additional interval.
• Hence, a model with two break points at $t^*_1$ and $t^*_2$ can be written as
$$E(Y_{ij}) = \beta_1 + \beta_2 \mathrm{Time}_{ij} + \beta_3 (\mathrm{Time}_{ij} - t^*_1)_+ + \beta_4 (\mathrm{Time}_{ij} - t^*_2)_+,$$
where $(x)_+$ is equal to x when x > 0 and zero otherwise.
• With break points at 2.5 and 5.5, the covariate matrix then takes the form
time    Bent Line
1       1  1  0    0
2       1  2  0    0
3       1  3  0.5  0
4       1  4  1.5  0
5       1  5  2.5  0
6       1  6  3.5  0.5
7       1  7  4.5  1.5
8       1  8  5.5  2.5
9       1  9  6.5  3.5
Parameter $\beta_1$ is the intercept, $\beta_2$ is the slope up to time 2.5, $\beta_2 + \beta_3$ is the slope between 2.5 and 5.5, and finally $\beta_2 + \beta_3 + \beta_4$ is the slope after 5.5.
Note B: The break points at t = 2.5 and t = 5.5 are formally called knots. For the step function, the number of parameters required is 1 plus the number of knots. For the bent line model the number of parameters required is 2 plus the number of knots.
Note C: For the bent line model we often require fewer knots than for the step function. As a result, in practice, the total number of parameters required for the bent line model is smaller than for the step function.
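The step-function and bent-line covariates for this example (times 1 to 9, knots at 2.5 and 5.5) can be generated in a few lines of R. The helper `pos` for the positive-part function $(x)_+$ and the object names are hypothetical, introduced for this sketch.

```r
time  <- 1:9
knots <- c(2.5, 5.5)
pos   <- function(x) pmax(x, 0)   # positive part (x)_+

# Step function, V1 coding: one indicator per interval.
V1 <- cbind(time < knots[1],
            time > knots[1] & time < knots[2],
            time > knots[2]) * 1

# Step function, V2 coding: intercept plus one indicator per knot.
V2 <- cbind(1, time > knots[1], time > knots[2])

# Bent line: intercept, time, and one positive-part term per knot.
bent <- cbind(1, time, pos(time - knots[1]), pos(time - knots[2]))

V1
V2
bent
```

In a regression, these matrices would be supplied as the mean-model covariates; the number of columns matches the parameter counts in Note B (1 plus the number of knots for the step function, 2 plus the number of knots for the bent line).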
C. Higher Order Polynomial Splines:
• Spline models can become even more complicated by using piece-wise quadratic or cubic models.
• Two quantities characterize spline models
– the order of the piece-wise polynomial on each interval
– the number of knots
• If a spline is of kth order, then for each knot there is a covariate that allows the coefficient of the kth-order term $t^k_{ij}$ to change. For example, at each knot there is a jump in the step function, and in the bent line model there is a change in the slope. The cubic spline model has an intercept, a slope, a quadratic and a cubic term. At each knot, say $t_{0k}$, the cubic spline model has a covariate of the form $(t_{ij} - t_{0k})^3_+$. This allows the coefficient of time cubed to change at each knot.
Note D: Generally, the number of parameters is equal to the degree of the polynomial plus the number of knots plus 1.
Modelling the Covariance
• Although the covariance between observations is not of primary interest, accounting for the covariance among repeated measures usually increases the precision with which the parameters are estimated.
• Furthermore, when we have missing data, the 'correct' specification of the covariance structure is often a requirement for valid estimates of the regression parameters.
• There are two aspects that require modelling: the mean and the covariance structure. Although they appear to be independent, they are interdependent, because the covariance between any pair of residuals $\{Y_{ij} - \mu_{ij}(\beta)\}$ and $\{Y_{ik} - \mu_{ik}(\beta)\}$ depends on the model for the mean. As a result, a model for the covariance should be chosen on the basis of some model for the mean.
A. Unstructured
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{pmatrix}$$
• The above structure is reasonable when the number of occasions is relatively small and all individuals are measured at the same set of occasions.
• Formal requirements:
– symmetric
– positive definite
• Advantage: No structure is imposed on the covariance matrix.
• Drawback 1: Many parameters to estimate. We have to estimate $n(n+1)/2$ parameters, a number that grows rapidly with n. As a result, the estimation process can be unstable.
• Drawback 2: Problematic when we have mistimed observations.
B. Compound Symmetry
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}$$
• The variance is assumed constant, $\sigma^2$, and $\mathrm{Corr}(Y_{ij}, Y_{ik}) = \rho$.
• Advantage: only two parameters to estimate.
• Drawback 1: makes the strong assumption that the correlation between any pair of observations is the same, regardless of the time interval between measurements. This is rather unappealing for most longitudinal data, since correlation is expected to decay with time.
• Drawback 2: the assumption of constant variance is also unrealistic. We have seen that the variance increases with time.
C. Toeplitz
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{n-1} \\ \rho_1 & 1 & \rho_1 & \cdots & \rho_{n-2} \\ \rho_2 & \rho_1 & 1 & \cdots & \rho_{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{n-1} & \rho_{n-2} & \rho_{n-3} & \cdots & 1 \end{pmatrix}$$
• Assumes that any pair of responses equally separated in time have the same correlation.
• The variance is constant, $\sigma^2$, and $\mathrm{Corr}(Y_{ij}, Y_{i,j+k}) = \rho_k$.
• Appropriate only when measurements are made at (approximately) equal intervals of time.
• There are n parameters to be estimated.
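D. Autoregressive
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{pmatrix}$$
• A special case of the Toeplitz covariance structure.
• The variance is constant, $\sigma^2$, and $\mathrm{Corr}(Y_{ij}, Y_{i,j+k}) = \rho^k$.
• Advantage: only two parameters to estimate.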
E. Banded
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} 1 & \rho_1 & 0 & \cdots & 0 \\ \rho_1 & 1 & \rho_1 & \cdots & 0 \\ 0 & \rho_1 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$$
• Makes the assumption that the correlation is zero beyond some point.
• The above is a banded Toeplitz covariance pattern with a band size of 2.
• The variance is constant, $\sigma^2$, and $\mathrm{Corr}(Y_{ij}, Y_{i,j+k}) = 0$ for $k \ge 2$.
• Disadvantage: makes a very strong assumption about how quickly the correlation decays.
F. Exponential
• When measurement occasions are not equally spaced, we can generalize the autoregressive pattern by assuming
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = \rho^{|t_{ij} - t_{ik}|}$$
for $\rho > 0$.
• Thus, the correlation decreases exponentially with the time separation between measurements.
• It is called exponential because
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = \rho^{|t_{ij} - t_{ik}|} = \exp\{-\theta |t_{ij} - t_{ik}|\},$$
where $\theta = -\log(\rho)$.
• Invariant under linear transformations of the time scale.
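To make the patterns concrete, the correlation matrices for structures B, D and F can be built in a few lines of base R. This is a minimal sketch: the helper names (`cs_corr`, `ar1_corr`, `exp_corr`) and the value of `rho` are assumptions chosen for illustration only.

```r
# Correlation matrices for three covariance patterns.
# Helper names and the value of rho are illustrative assumptions.
cs_corr <- function(n, rho) {          # compound symmetry
  R <- matrix(rho, n, n)
  diag(R) <- 1
  R
}
ar1_corr <- function(n, rho) {         # autoregressive: rho^|j - k|
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))
  rho^lag
}
exp_corr <- function(times, rho) {     # exponential: rho^|t_j - t_k|
  sep <- abs(outer(times, times, "-"))
  rho^sep
}

rho <- 0.6
cs_corr(4, rho)
ar1_corr(4, rho)
exp_corr(c(0, 1, 4, 6), rho)           # unequally spaced occasions
```

With equally spaced times the exponential pattern reduces to the autoregressive one, which is why it is described as a generalization.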
Example: TLC Data
glsTLC:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3\,(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the compound symmetry form
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} 1 & \rho & \rho & \rho \\ \rho & 1 & \rho & \rho \\ \rho & \rho & 1 & \rho \\ \rho & \rho & \rho & 1 \end{pmatrix}$$
> gls(lead ~ factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id))
> glsTLC=gls(lead~factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id))
> summary(glsTLC)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2480.621 2520.334 -1230.311

Correlation Structure: Compound symmetry
 Formula: ~1 | id
 Parameter estimate(s):
      Rho
0.5954401

Coefficients:
                               Value Std.Error    t-value p-value
(Intercept)                   26.540 0.9370175  28.323911  0.0000
factor(group)P                -0.268 1.3251428  -0.202242  0.8398
factor(time)1                -13.018 0.8428574 -15.445080  0.0000
factor(time)4                -11.026 0.8428574 -13.081691  0.0000
factor(time)6                 -5.778 0.8428574  -6.855252  0.0000
factor(group)P:factor(time)1  11.406 1.1919804   9.568950  0.0000
factor(group)P:factor(time)4   8.824 1.1919804   7.402807  0.0000
factor(group)P:factor(time)6   3.152 1.1919804   2.644339  0.0085

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.450  0.318
factor(time)4                -0.450  0.318  0.500
factor(time)6                -0.450  0.318  0.500  0.500
factor(group)P:factor(time)1  0.318 -0.450 -0.707 -0.354 -0.354
factor(group)P:factor(time)4  0.318 -0.450 -0.354 -0.707 -0.354  0.500
factor(group)P:factor(time)6  0.318 -0.450 -0.354 -0.354 -0.707  0.500    0.500

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.5147478 -0.6973588 -0.1498706  0.5542799  6.5106944

Residual standard error: 6.625714
Degrees of freedom: 400 total; 392 residual
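The two covariance parameters reported for glsTLC determine the whole fitted covariance matrix. As a rough sketch in base R, with the two estimates copied by hand from the output above (nothing is refitted, and the variable names are illustrative):

```r
# Estimates copied from the summary(glsTLC) output above.
sigma <- 6.625714    # residual standard error
rho   <- 0.5954401   # compound-symmetry correlation (Rho)

n_occasions <- 4     # occasions 0, 1, 4 and 6
R <- matrix(rho, n_occasions, n_occasions)
diag(R) <- 1
V <- sigma^2 * R     # implied Cov(Y_i) = sigma^2 * R
round(V, 2)
```

Every diagonal entry equals sigma^2 and every off-diagonal entry equals rho * sigma^2, which is exactly the restriction that compound symmetry imposes.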
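glsTLC2:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3\,(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the compound symmetry form with a different variance at each occasion
$$\mathrm{Cov}(Y_i) = \sigma^2 \begin{pmatrix} s_1^2 & s_1 s_2 \rho & s_1 s_3 \rho & s_1 s_4 \rho \\ s_2 s_1 \rho & s_2^2 & s_2 s_3 \rho & s_2 s_4 \rho \\ s_3 s_1 \rho & s_3 s_2 \rho & s_3^2 & s_3 s_4 \rho \\ s_4 s_1 \rho & s_4 s_2 \rho & s_4 s_3 \rho & s_4^2 \end{pmatrix}$$
where $s_j$ is the standard deviation at occasion j relative to the first occasion ($s_1 = 1$).
> gls(lead ~ factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id),
weight=varIdent(form=~1|time))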
> # Compound Symmetry with different variances at different occasions
> glsTLC2=gls(lead~factor(group)*factor(time),data=tlc.long,correlation=corCompSymm(form=~1|id),weight=varIdent(form=~1|time))
> summary(glsTLC2)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2459.960 2511.587 -1216.980

Correlation Structure: Compound symmetry
 Formula: ~1 | id
 Parameter estimate(s):
      Rho
0.6102797
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.279651 1.323192 1.519196

Coefficients:
                               Value Std.Error   t-value p-value
(Intercept)                   26.540 0.7238068  36.66724  0.0000
factor(group)P                -0.268 1.0236174  -0.26182  0.7936
factor(time)1                -13.018 0.7506743 -17.34174  0.0000
factor(time)4                -11.026 0.7713904 -14.29367  0.0000
factor(time)6                 -5.778 0.8726864  -6.62094  0.0000
factor(group)P:factor(time)1  11.406 1.0616138  10.74402  0.0000
factor(group)P:factor(time)4   8.824 1.0909108   8.08865  0.0000
factor(group)P:factor(time)6   3.152 1.2341649   2.55395  0.0110

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.211  0.149
factor(time)4                -0.181  0.128  0.402
factor(time)6                -0.060  0.043  0.383  0.383
factor(group)P:factor(time)1  0.149 -0.211 -0.707 -0.285 -0.270
factor(group)P:factor(time)4  0.128 -0.181 -0.285 -0.707 -0.271  0.402
factor(group)P:factor(time)6  0.043 -0.060 -0.270 -0.271 -0.707  0.383    0.383

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.1429187 -0.6927684 -0.1528875  0.5263104  5.5480270

Residual standard error: 5.118087
Degrees of freedom: 400 total; 392 residual
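glsTLC3:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3\,(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the general symmetric form with a common variance
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma^2 & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma^2 & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2 \end{pmatrix}$$
> gls(lead ~ factor(group)*factor(time),data=tlc.long,correlation=corSymm(form=~1|id))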
> summary(glsTLC3)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2471.632 2531.201 -1220.816

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.596
3 0.582 0.769
4 0.536 0.552 0.551

Coefficients:
                               Value Std.Error    t-value p-value
(Intercept)                   26.540 0.9374730  28.310148  0.0000
factor(group)P                -0.268 1.3257871  -0.202144  0.8399
factor(time)1                -13.018 0.8425878 -15.450023  0.0000
factor(time)4                -11.026 0.8576242 -12.856447  0.0000
factor(time)6                 -5.778 0.9034129  -6.395747  0.0000
factor(group)P:factor(time)1  11.406 1.1915990   9.572012  0.0000
factor(group)P:factor(time)4   8.824 1.2128637   7.275343  0.0000
factor(group)P:factor(time)6   3.152 1.2776188   2.467090  0.0140

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.449  0.318
factor(time)4                -0.457  0.323  0.719
factor(time)6                -0.482  0.341  0.485  0.492
factor(group)P:factor(time)1  0.318 -0.449 -0.707 -0.508 -0.343
factor(group)P:factor(time)4  0.323 -0.457 -0.508 -0.707 -0.348  0.719
factor(group)P:factor(time)6  0.341 -0.482 -0.343 -0.348 -0.707  0.485    0.492

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.5135258 -0.6970199 -0.1497978  0.5540105  6.5075307

Residual standard error: 6.628935
Degrees of freedom: 400 total; 392 residual
glsTLC4:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j + \beta_3\,(\text{group}_i \times \text{time}_j)$$
with the covariance matrix having the unstructured form
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_4^2 \end{pmatrix}$$
> gls(lead ~ factor(group)*factor(time),data=tlc.long,correlation=corSymm(form=~1|id),
weight=varIdent(form=~1|time))
> summary(glsTLC4)
Generalized least squares fit by REML
  Model: lead ~ factor(group) * factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2452.076 2523.559 -1208.038

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.571
3 0.570 0.775
4 0.577 0.582 0.581
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.325887 1.370453 1.524826

Coefficients:
                               Value Std.Error   t-value p-value
(Intercept)                   26.540 0.7102888  37.36508  0.0000
factor(group)P                -0.268 1.0045001  -0.26680  0.7898
factor(time)1                -13.018 0.7919194 -16.43854  0.0000
factor(time)4                -11.026 0.8149168 -13.53022  0.0000
factor(time)6                 -5.778 0.8885252  -6.50291  0.0000
factor(group)P:factor(time)1  11.406 1.1199432  10.18445  0.0000
factor(group)P:factor(time)4   8.824 1.1524663   7.65662  0.0000
factor(group)P:factor(time)6   3.152 1.2565644   2.50843  0.0125

 Correlation:
                             (Intr) fct()P fct()1 fct()4 fct()6 f()P:()1 f()P:()4
factor(group)P               -0.707
factor(time)1                -0.218  0.154
factor(time)4                -0.191  0.135  0.680
factor(time)6                -0.096  0.068  0.386  0.385
factor(group)P:factor(time)1  0.154 -0.218 -0.707 -0.481 -0.273
factor(group)P:factor(time)4  0.135 -0.191 -0.481 -0.707 -0.272  0.680
factor(group)P:factor(time)6  0.068 -0.096 -0.273 -0.272 -0.707  0.386    0.385

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.1756391 -0.6849959 -0.1515546  0.5294172  5.6327402

Residual standard error: 5.0225
Degrees of freedom: 400 total; 392 residual
glsTLC5:
$$\text{lead}_{ij} = \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{time}_j$$
with the covariance matrix having the unstructured form
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_4^2 \end{pmatrix}$$
> gls(lead ~ factor(group)+factor(time),data=tlc.long,correlation=corSymm(form=~1|id),
weight=varIdent(form=~1|time))
> summary(glsTLC5)
Generalized least squares fit by REML
  Model: lead ~ factor(group) + factor(time)
  Data: tlc.long
       AIC      BIC    logLik
  2525.171 2584.854 -1247.585

Correlation Structure: General
 Formula: ~1 | id
 Parameter estimate(s):
 Correlation:
  1     2     3
2 0.334
3 0.407 0.822
4 0.551 0.512 0.550
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | time
 Parameter estimates:
       0        1        4        6
1.000000 1.567427 1.478107 1.484946

Coefficients:
                   Value Std.Error  t-value p-value
(Intercept)    25.399921 0.7104368 35.75254  0.0000
factor(group)P  2.012157 0.9786873  2.05598  0.0404
factor(time)1  -7.315000 0.7993320 -9.15139  0.0000
factor(time)4  -6.614000 0.7247826 -9.12549  0.0000
factor(time)6  -4.202000 0.6448471 -6.51627  0.0000

 Correlation:
               (Intr) fct()P fct()1 fct()4
factor(group)P -0.689
factor(time)1  -0.222  0.000
factor(time)4  -0.205  0.000  0.814
factor(time)6  -0.105  0.000  0.437  0.446

Standardized residuals:
        Min          Q1         Med          Q3         Max
-2.23560001 -0.64349384 -0.04593555  0.61603808  5.58341364

Residual standard error: 5.150371
Degrees of freedom: 400 total; 395 residual
> anova(glsTLC,glsTLC2)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC2     2 13 2459.960 2511.587 -1216.980 1 vs 2 26.66058  <.0001

> anova(glsTLC,glsTLC3)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC3     2 15 2471.632 2531.200 -1220.816 1 vs 2 18.98944  0.0019

> anova(glsTLC,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC      1 10 2480.621 2520.334 -1230.311
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 44.54507  <.0001

> anova(glsTLC2,glsTLC3)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC2     1 13 2459.960 2511.587 -1216.980
glsTLC3     2 15 2471.632 2531.200 -1220.816 1 vs 2 7.671143  0.0216

> anova(glsTLC2,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC2     1 13 2459.960 2511.587 -1216.980
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 17.88450  0.0031

> anova(glsTLC3,glsTLC4)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC3     1 15 2471.632 2531.200 -1220.816
glsTLC4     2 18 2452.076 2523.559 -1208.038 1 vs 2 25.55564  <.0001

> anova(glsTLC4,glsTLC5)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
glsTLC4     1 18 2452.076 2523.559 -1208.038
glsTLC5     2 15 2525.171 2584.854 -1247.585 1 vs 2 79.09486  <.0001
Warning message:
In anova.lme(object = glsTLC4, glsTLC5) :
  Fitted objects with different fixed effects. REML comparisons are not meaningful.

> anova(update(glsTLC4,method='ML'),update(glsTLC5, method='ML'))
                               Model df      AIC      BIC    logLik   Test  L.Ratio p-value
update(glsTLC4, method = "ML")     1 18 2461.368 2533.214 -1212.684
update(glsTLC5, method = "ML")     2 15 2529.555 2589.427 -1249.778 1 vs 2 74.18778  <.0001
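The entries of the first anova table can be reproduced by hand, which also shows where the df column comes from. The sketch below uses base R only and copies the log-likelihoods from the output above. The parameter counts in the comments are an interpretation of that output (8 regression coefficients plus the covariance parameters), not something the slides state explicitly.

```r
# REML log-likelihoods copied from the summaries above.
logLik_cs  <- -1230.311   # glsTLC:  compound symmetry
logLik_hcs <- -1216.980   # glsTLC2: compound symmetry + occasion-specific variances

n_fixed <- 8                  # regression coefficients in both models
df_cs   <- n_fixed + 2        # sigma and rho
df_hcs  <- n_fixed + 2 + 3    # plus 3 variance ratios (the first is fixed at 1)

# Likelihood ratio statistic and its chi-square p-value
lr_stat <- 2 * (logLik_hcs - logLik_cs)
p_value <- pchisq(lr_stat, df = df_hcs - df_cs, lower.tail = FALSE)

# AIC = -2 logLik + 2 * (number of parameters)
aic_cs <- -2 * logLik_cs + 2 * df_cs

c(lr_stat = lr_stat, df = df_hcs - df_cs, p_value = p_value, aic_cs = aic_cs)
```

The same counting gives 8 + 4(4+1)/2 = 18 for the unstructured model glsTLC4, in agreement with the df column.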
Random Effects
• We need to understand (at least qualitatively) the likely sources of random variation.
• One possible source is random effects: when units are sampled at random from a population, various aspects of their behaviour may show stochastic variation between units.
• We introduce the linear random effects model, where
- the response is assumed to be a linear function of explanatory variables, with regression coefficients that vary from one individual to the next
- this variability reflects natural heterogeneity due to unmeasured factors
Example: Children birth weight and growth rate.
• A random effects model is a reasonable description if the set of coefficients from a population of children can be thought of as a sample from a distribution.
• Given the actual coefficients for a child, the linear random effects model assumes that repeated observations on that child are independent.
• Correlation arises because we cannot observe the underlying growth curve, that is, the regression coefficients; we have only imperfect measurements of weight on each infant.
• So the model takes the form
$$E(Y_{ij} \mid U_i) = (\beta_0 + U_i) + \beta_1\,\text{time}_{ij}$$
• Typically, a parametric model, such as a Gaussian with mean 0 and unknown variance $\sigma^2$, is used for $U_i$.
Linear Mixed Models
• The usual linear model
$$y = X\beta + e,$$
where
- $y = (y_1, \ldots, y_n)'$ is an $n \times 1$ vector of independent observations
- $\beta$ is a $p \times 1$ vector of unknown parameters
- $X$ is an $n \times p$ design (model) matrix
- $e = (e_1, \ldots, e_n)'$ is an $n \times 1$ vector of independent errors
• The linear mixed model (general)
$$Y_i = X_i\beta + Z_i b_i + e_i,$$
where
- $Y_i$, $\beta$ and $e_i$ are as before, with
∗ $E(e_i) = 0_n$
∗ $\mathrm{Var}(e_i) = W$
- $Z_i$ is a given $n \times q$ matrix (the columns of $Z_i$ are a subset of the columns of $X_i$)
- $b_i$ is an unobservable $q \times 1$ random vector, following (theoretically) any multivariate distribution with
∗ $E(b_i) = 0_q$
∗ $\mathrm{Var}(b_i) = B$
In practice $b_i$ is assumed to follow a multivariate normal distribution.
- In addition, the vectors $b_i$ and $e_i$ are assumed uncorrelated.
- $E(Y_i) = X_i\beta$
- $\mathrm{Var}(Y_i) = \mathrm{Var}(X_i\beta + Z_i b_i + e_i) = Z_i B Z_i' + W$.
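The variance identity in the last item follows in a few steps from the assumptions listed above. Spelled out as a sketch in the slide's notation, using only that $X_i\beta$ is fixed and that $b_i$ and $e_i$ are uncorrelated:

```latex
\begin{aligned}
\mathrm{Var}(Y_i)
  &= \mathrm{Var}(X_i\beta + Z_i b_i + e_i) \\
  &= \mathrm{Var}(Z_i b_i + e_i)
     && \text{($X_i\beta$ is a constant)} \\
  &= Z_i\,\mathrm{Var}(b_i)\,Z_i' + \mathrm{Var}(e_i)
     + Z_i\,\mathrm{Cov}(b_i, e_i) + \mathrm{Cov}(e_i, b_i)\,Z_i' \\
  &= Z_i B Z_i' + W
     && \text{(since $\mathrm{Cov}(b_i, e_i) = 0$)}
\end{aligned}
```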
Random Intercept Model
Consider the model
$$Y_{ij} = X_{ij}'\beta + b_i + e_{ij} = (\beta_1 + b_i) + X_{ij2}\beta_2 + \cdots + X_{ijp}\beta_p + e_{ij}$$
• Each subject's profile appears flat (across occasions) or parallel to the population mean profile.
• Observations $Y_{ij}$ vary around a different level for each subject. These levels are the intercepts of the lines around which each subject's responses vary, where $b_i$ represents the deviation of subject i's intercept from the population intercept ($\beta_1$).
• The set of intercepts is a sample from the population of intercepts.
• This implies that there is between-subject variability (equivalent to within-subject correlation).
[Figure: Response against Time (occasions 1 to 5).]
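• Furthermore, the variance of $Y_{ij}$ takes the form
$$\begin{aligned} \mathrm{Var}(Y_{ij}) &= \mathrm{Var}(X_{ij}'\beta + b_i + e_{ij}) \\ &= \mathrm{Var}(b_i) + \mathrm{Var}(e_{ij}) \\ &= \sigma_b^2 + \sigma^2 \end{aligned}$$
and the covariance between any pair of observations on the same subject is
$$\begin{aligned} \mathrm{Cov}(Y_{ij}, Y_{ik}) &= \mathrm{Cov}(X_{ij}'\beta + b_i + e_{ij},\; X_{ik}'\beta + b_i + e_{ik}) \\ &= \mathrm{Cov}(b_i, b_i) \\ &= \sigma_b^2. \end{aligned}$$
The covariance matrix then becomes
$$\mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_b^2 + \sigma^2 & \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\ \sigma_b^2 & \sigma_b^2 + \sigma^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\ \sigma_b^2 & \sigma_b^2 & \sigma_b^2 + \sigma^2 & \cdots & \sigma_b^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_b^2 & \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 + \sigma^2 \end{pmatrix},$$
and the correlation between two observations becomes
$$\rho = \mathrm{Corr}(Y_{ij}, Y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.$$
• The presence of the random effect induces correlation among the repeated measurements. This is also known as the intra-class correlation.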
Note: In statistics, the intraclass correlation is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures it operates on data structured as groups, rather than data structured as paired observations.
• The model
$$E(Y_{ij} \mid b_i) = X_{ij}'\beta + b_i$$
is referred to as the conditional or subject-specific mean model.
• The model
$$E(Y_{ij}) = X_{ij}'\beta$$
is referred to as the marginal or population-averaged mean model.
[Figure: Response against Time (occasions 1 to 5).]
Example: Orthodont Data [included in nlme package]
• A set of measurements of the distance from the pituitary gland to the pterygomaxillary fissure, taken every 2 years.
• Measurements taken from 8 to 14 years of age.
• We have 27 children: 16 males and 11 females.
• Data collected from x-rays.
[Figure: distance from pituitary to pterygomaxillary fissure (mm) against age (yr), one panel per subject (M01 to M16 and F01 to F11).]
> levels(Orthodont$Sex)
[1] "Male"   "Female"
> OrthoFem=Orthodont[Orthodont$Sex=="Female",]
> lmF=lmList(distance ~ age, data=OrthoFem)
> coef(lmF)
    (Intercept)   age
F10       13.55 0.450
F09       18.10 0.275
F06       17.00 0.375
F01       17.25 0.375
F05       19.60 0.275
F07       16.95 0.550
F02       14.20 0.800
F08       21.45 0.175
F03       14.40 0.850
F04       19.65 0.475
F11       18.95 0.675
> intervals(lmF)
, , (Intercept)

       lower  est.    upper
F10 10.07138 13.55 17.02862
F09 14.62138 18.10 21.57862
F06 13.52138 17.00 20.47862
F01 13.77138 17.25 20.72862
F05 16.12138 19.60 23.07862
F07 13.47138 16.95 20.42862
F02 10.72138 14.20 17.67862
F08 17.97138 21.45 24.92862
F03 10.92138 14.40 17.87862
F04 16.17138 19.65 23.12862
F11 15.47138 18.95 22.42862

, , age

          lower  est.     upper
F10  0.14009962 0.450 0.7599004
F09 -0.03490038 0.275 0.5849004
F06  0.06509962 0.375 0.6849004
F01  0.06509962 0.375 0.6849004
F05 -0.03490038 0.275 0.5849004
F07  0.24009962 0.550 0.8599004
F02  0.49009962 0.800 1.1099004
F08 -0.13490038 0.175 0.4849004
F03  0.54009962 0.850 1.1599004
F04  0.16509962 0.475 0.7849004
F11  0.36509962 0.675 0.9849004
[Figure: confidence intervals for the (Intercept) and age coefficients of lmF, one row per subject (F10 to F11).]
> lmF2=update(lmF,distance~I(age-11))
> intervals(lmF2)
, , (Intercept)

       lower   est.    upper
F10 17.80704 18.500 19.19296
F09 20.43204 21.125 21.81796
F06 20.43204 21.125 21.81796
F01 20.68204 21.375 22.06796
F05 21.93204 22.625 23.31796
F07 22.30704 23.000 23.69296
F02 22.30704 23.000 23.69296
F08 22.68204 23.375 24.06796
F03 23.05704 23.750 24.44296
F04 24.18204 24.875 25.56796
F11 25.68204 26.375 27.06796

, , I(age - 11)

          lower  est.     upper
F10  0.14009962 0.450 0.7599004
F09 -0.03490038 0.275 0.5849004
F06  0.06509962 0.375 0.6849004
F01  0.06509962 0.375 0.6849004
F05 -0.03490038 0.275 0.5849004
F07  0.24009962 0.550 0.8599004
F02  0.49009962 0.800 1.1099004
F08 -0.13490038 0.175 0.4849004
F03  0.54009962 0.850 1.1599004
F04  0.16509962 0.475 0.7849004
F11  0.36509962 0.675 0.9849004
[Figure: confidence intervals for the (Intercept) and I(age - 11) coefficients of lmF2, one row per subject (F10 to F11).]
> lmeF=lme(distance~age,data=OrthoFem,random=~1) # Using REML
> summary(lmeF)
Linear mixed-effects model fit by REML
 Data: OrthoFem
       AIC     BIC    logLik
  149.2183 156.169 -70.60916

Random effects:
 Formula: ~1 | Subject
        (Intercept)  Residual
StdDev:     2.06847 0.7800331

Fixed effects: distance ~ age
                Value Std.Error DF   t-value p-value
(Intercept) 17.372727 0.8587419 32 20.230440       0
age          0.479545 0.0525898 32  9.118598       0
 Correlation:
    (Intr)
age -0.674

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.2736479 -0.7090164  0.1728237  0.4122128  1.6325181

Number of Observations: 44
Number of Groups: 11
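The two standard deviations under "Random effects" are all that is needed to evaluate the intra-class correlation $\sigma_b^2 / (\sigma_b^2 + \sigma^2)$ of the random intercept model. A small sketch in base R, with the values copied by hand from the summary(lmeF) output above and illustrative variable names:

```r
# Standard deviations copied from summary(lmeF).
sd_intercept <- 2.06847     # between-subject (random intercept)
sd_residual  <- 0.7800331   # within-subject (residual)

# Intra-class correlation implied by the random intercept model
icc <- sd_intercept^2 / (sd_intercept^2 + sd_residual^2)
round(icc, 3)
```

A value this close to 1 says that most of the variability in distance among the girls is between subjects rather than within them.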
> lmeF0=lme(distance~I(age-11),data=OrthoFem,random=~1)
> summary(lmeF0)
Linear mixed-effects model fit by REML
 Data: OrthoFem
       AIC     BIC    logLik
  149.2183 156.169 -70.60916

Random effects:
 Formula: ~1 | Subject
        (Intercept)  Residual
StdDev:     2.06847 0.7800331

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF t-value p-value
(Intercept) 22.647727 0.6346568 32 35.6850       0
I(age - 11)  0.479545 0.0525898 32  9.1186       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.2736479 -0.7090164  0.1728237  0.4122128  1.6325181

Number of Observations: 44
Number of Groups: 11
Random Intercept and Slope Model
Consider the model
$$Y_{ij} = (\beta_1 + b_{1i}) + (\beta_2 + b_{2i})\,t_{ij} + e_{ij}.$$
• Subjects vary with respect to (i) their baseline level (when $t_{i1} = 0$) and (ii) the rate of change of their response over time.
• In this particular case we have $q = p = 2$ and
$$X_i = Z_i = \begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ \vdots & \vdots \\ 1 & t_{in_i} \end{pmatrix}.$$
• Additionally, consider the variance
$$\begin{aligned} \mathrm{Var}(Y_{ij}) &= \mathrm{Var}(X_{ij}'\beta + Z_{ij}'b_i + e_{ij}) \\ &= \mathrm{Var}(Z_{ij}'b_i + e_{ij}) \\ &= \mathrm{Var}(b_{1i} + b_{2i}t_{ij} + e_{ij}) \\ &= \mathrm{Var}(b_{1i}) + 2t_{ij}\,\mathrm{Cov}(b_{1i}, b_{2i}) + t_{ij}^2\,\mathrm{Var}(b_{2i}) + \mathrm{Var}(e_{ij}), \end{aligned}$$
and the covariance among the repeated observations on the same subject becomes
$$\mathrm{Cov}(Y_{ij}, Y_{ik}) = \mathrm{Var}(b_{1i}) + (t_{ij} + t_{ik})\,\mathrm{Cov}(b_{1i}, b_{2i}) + t_{ij}t_{ik}\,\mathrm{Var}(b_{2i}).$$
• Hence, the covariance matrix can be expressed as a function of time.
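The element-wise formulas can be checked against the matrix form $Z_i B Z_i' + \sigma^2 I$. The sketch below uses base R with made-up values for the times, for $B$ and for $\sigma^2$ (all hypothetical and chosen only for illustration, not estimates from the slides), and assumes conditionally independent errors.

```r
# Illustrative (hypothetical) values, not estimates from the slides.
times  <- c(-3, -1, 1, 3)              # measurement times t_ij
B      <- matrix(c(4.0, 0.2,
                   0.2, 0.05), 2, 2)   # Var(b_1i), Cov(b_1i, b_2i), Var(b_2i)
sigma2 <- 1.5                          # Var(e_ij), errors assumed independent

Z <- cbind(1, times)                   # Z_i: intercept and time columns
V <- Z %*% B %*% t(Z) + sigma2 * diag(length(times))

# Element-wise check against the formulas on the slide
j <- 1; k <- 3
cov_jk <- B[1, 1] + (times[j] + times[k]) * B[1, 2] + times[j] * times[k] * B[2, 2]
var_j  <- B[1, 1] + 2 * times[j] * B[1, 2] + times[j]^2 * B[2, 2] + sigma2
c(V[j, k], cov_jk, V[j, j], var_j)
```

Changing `times` changes every entry of `V` while the number of covariance parameters stays at four, which is the sense in which the covariance is a function of time.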
Covariance Structure
In the linear mixed model
$$Y_i = X_i\beta + Z_i b_i + e_i,$$
the matrix $W_i = \mathrm{Cov}(e_i)$ describes the covariance among the repeated observations when focusing on the conditional mean response profile of a specific individual. In other words, it is the covariance of the ith individual's deviations from the response profile
$$E(Y_i \mid b_i) = X_i\beta + Z_i b_i.$$
• The usual assumption is $W_i = \sigma^2 I_{n_i}$. This is referred to as the conditional independence assumption.
• The conditional covariance becomes
$$\mathrm{Cov}(Y_i \mid b_i) = \mathrm{Cov}(e_i) = W_i$$
• The marginal covariance then takes the form
$$\mathrm{Cov}(Y_i) = Z_i B Z_i' + W_i.$$
• $\mathrm{Cov}(Y_i)$ allows for between-subject ($B$) and within-subject ($W_i$) sources of variation.
• Because $\mathrm{Cov}(Y_i)$ is a function of the measurement times (when time is in $Z_i$), in principle each subject may have its own measurement times.
• The comparison of random effects models for the covariance is based on the likelihood ratio test (REML). A test of two nested models, one with q and the other with q + 1 correlated random effects, leads to a chi-square test on q + 1 df (1 for the variance and q for the covariances). However, caution is needed when the null hypothesis is on the boundary of the parameter space.
Some Characteristics
• Balanced data are not required.
• The covariances are functions of time. As a result, if time is included in $Z_i$, each patient can have their own sequence of measurement times. This property makes these models suitable for the analysis of real-life longitudinal data.
• The number of covariance parameters that need to be estimated remains unchanged regardless of the number of measurements.
• The random effects covariance structure allows the variances and covariances to change (increase or decrease) as a function of the measurement times, without introducing restrictive structures as the covariance pattern models do.
Prediction
• In the analysis of longitudinal data the interest in the fixed effects $\beta$ is obvious. The interpretation of the parameters is clear and associated with the mean response over time and with changes in covariates.
• In many cases, however, subject-specific trajectories are of interest.
• Under the linear mixed-effects model, patient-specific response trajectories can be predicted/estimated.
• This is possible by obtaining predictions of the subject-specific effects $b_i$ (random effects), or of
$$X_i\beta + Z_i b_i.$$
• Generally, predicting a random variable, and hence the patient-specific response trajectory, amounts to predicting its conditional mean given the available data.
• Two pieces of information contribute to the estimation/prediction of $b_i$.
- The first is the statement that
$$b_i \sim N(0, B)$$
(the prior of $b_i$).
- The second is the likelihood of the data $Y_i$, which says that
$$Y_i \mid b_i \sim N(X_i\beta + Z_i b_i,\; W_i).$$
• We combine the information by multiplying the two densities (giving the joint density) and, after some algebra, we get
$$E(b_i \mid Y_i) = B Z_i' \Sigma_i^{-1} (Y_i - X_i\beta),$$
where $\Sigma_i = \mathrm{Cov}(Y_i) = Z_i B Z_i' + W_i$. This is known as the BLUP.
• The predictor of $b_i$ depends on $B$. Hence, when the unknown parameters are replaced by their REML estimators, we have
$$\hat{b}_i = \hat{B} Z_i' \hat{\Sigma}_i^{-1} (Y_i - X_i\hat{\beta}),$$
also known as the empirical BLUP (or empirical Bayes estimate).
• Given $\hat{b}_i$ we obtain
$$\hat{Y}_i = X_i\hat{\beta} + Z_i\hat{b}_i.$$
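• As a result we have
$$\begin{aligned} \hat{Y}_i &= X_i\hat{\beta} + Z_i\hat{b}_i \\ &= X_i\hat{\beta} + Z_i\hat{B}Z_i'\hat{\Sigma}_i^{-1}(Y_i - X_i\hat{\beta}) \\ &= (I_{n_i} - Z_i\hat{B}Z_i'\hat{\Sigma}_i^{-1})\,X_i\hat{\beta} + Z_i\hat{B}Z_i'\hat{\Sigma}_i^{-1}\,Y_i \\ &= (\hat{W}_i\hat{\Sigma}_i^{-1})\,X_i\hat{\beta} + (I_{n_i} - \hat{W}_i\hat{\Sigma}_i^{-1})\,Y_i, \end{aligned}$$
where
$$\hat{\Sigma}_i\hat{\Sigma}_i^{-1} = I_{n_i} = (Z_i\hat{B}Z_i' + \hat{W}_i)\hat{\Sigma}_i^{-1} = Z_i\hat{B}Z_i'\hat{\Sigma}_i^{-1} + \hat{W}_i\hat{\Sigma}_i^{-1}.$$
This expression shows that $\hat{Y}_i$ is a weighted mean of $X_i\hat{\beta}$, the population-averaged mean response profile, and $Y_i$, the ith patient's observed response profile.
• As a result, the predicted response profile is pulled (shrinks) towards the population-averaged mean response profile.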
• The amount of shrinkage depends on $W_i$ and $\Sigma_i$.
• If $W_i$ is "large", then the within-subject variability is greater than the between-subject variability, and hence more weight is given to the population-averaged mean response profile $X_i\hat{\beta}$.
• The opposite holds when $W_i$ is "small".
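The role of "large" and "small" $W_i$ is easiest to see in the random intercept model. As a sketch, assume $Z_i = 1_{n_i}$, $B = \sigma_b^2$ and $W_i = \sigma^2 I_{n_i}$. Here $\bar{Y}_i$ denotes the mean of subject i's responses and $\bar{X}_i'\hat{\beta}$ the mean of the corresponding population fitted values; both symbols are introduced only for this illustration.

```latex
% Random-intercept special case: Z_i = 1_{n_i}, B = \sigma_b^2, W_i = \sigma^2 I_{n_i}
\begin{aligned}
\hat{b}_i &= \frac{n_i\sigma_b^2}{n_i\sigma_b^2 + \sigma^2}\,
             \bigl(\bar{Y}_i - \bar{X}_i'\hat{\beta}\bigr), \\[4pt]
\frac{n_i\sigma_b^2}{n_i\sigma_b^2 + \sigma^2} &\longrightarrow
\begin{cases}
  1, & \sigma^2/\sigma_b^2 \to 0 \quad \text{(little shrinkage)}, \\
  0, & \sigma^2/\sigma_b^2 \to \infty \quad \text{(full shrinkage to the population mean)}.
\end{cases}
\end{aligned}
```

The weight on the subject's own data also increases with $n_i$, so subjects with more measurements are shrunk less.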
Example: Orthodont (cont.)
> lmeOrth1=lme(distance ~ I(age-11),data=Orthodont,random=~1)
> lmeOrth1ml=update(lmeOrth1,method='ML')
> lmeOrth2=lme(distance ~ I(age-11),data=Orthodont)
> lmeOrth2ml=update(lmeOrth2,method='ML')
> lmeOrth3=update(lmeOrth2,fixed=distance ~ Sex*I(age-11))
> summary(lmeOrth1)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  455.0025 465.6563 -223.5013

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:    2.114724 1.431592

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4296605 80 55.91193       0
I(age - 11)  0.660185 0.0616059 80 10.71626       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.66453932 -0.53507984 -0.01289591  0.48742859  3.72178465

Number of Observations: 108
Number of Groups: 27
> coef(lmeOrth1) #subject specific coefficients (random intercept only)
    (Intercept) I(age - 11)
M16    23.10517   0.6601852
M05    23.10517   0.6601852
M02    23.44163   0.6601852
M11    23.66593   0.6601852
M07    23.77808   0.6601852
M08    23.89023   0.6601852
M03    24.22668   0.6601852
M12    24.22668   0.6601852
M13    24.22668   0.6601852
M14    24.78744   0.6601852
M09    25.01174   0.6601852
M15    25.68464   0.6601852
M06    26.13325   0.6601852
M04    26.35755   0.6601852
M01    27.36691   0.6601852
M10    28.93702   0.6601852
F10    19.06774   0.6601852
F09    21.42291   0.6601852
F06    21.42291   0.6601852
F01    21.64721   0.6601852
F05    22.76872   0.6601852
F07    23.10517   0.6601852
F02    23.10517   0.6601852
F08    23.44163   0.6601852
F03    23.77808   0.6601852
F04    24.78744   0.6601852
F11    26.13325   0.6601852
> summary(lmeOrth1ml)
Linear mixed-effects model fit by maximum likelihood
 Data: Orthodont
       AIC      BIC    logLik
  451.3895 462.1181 -221.6948

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:    2.072142 1.422728

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4255878 80 56.44699       0
I(age - 11)  0.660185 0.0617993 80 10.68272       0
 Correlation:
            (Intr)
I(age - 11) 0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.68695130 -0.53862941 -0.01232442  0.49100161  3.74701483

Number of Observations: 108
Number of Groups: 27
> OrthRE1ml=random.effects(lmeOrth1ml)
> OrthRE1ml
    (Intercept)
M16  -0.9152788
M05  -0.9152788
M02  -0.5798146
M11  -0.3561719
M07  -0.2443505
M08  -0.1325291
M03   0.2029351
M12   0.2029351
M13   0.2029351
M14   0.7620421
M09   0.9856849
M15   1.6566133
M06   2.1038989
M04   2.3275416
M01   3.3339342
M10   4.8994337
F10  -4.9408491
F09  -2.5925998
F06  -2.5925998
F01  -2.3689570
F05  -1.2507430
F07  -0.9152788
F02  -0.9152788
F08  -0.5798146
F03  -0.2443505
F04   0.7620421
F11   2.1038989
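For a random intercept model with the same number of observations on every child and age centred at 11, the empirical BLUP reduces to a shrinkage factor times the difference between the child's mean and the population intercept. The sketch below applies this to two girls, using the ML estimates from summary(lmeOrth1ml) and the per-child intercepts at age 11 from intervals(lmF2) as the subject means. All numbers are copied by hand from the outputs, nothing is refitted, and the variable names are illustrative.

```r
# ML estimates copied from summary(lmeOrth1ml).
sigma_b <- 2.072142    # random-intercept standard deviation
sigma   <- 1.422728    # residual standard deviation
beta0   <- 24.023148   # population intercept (mean distance at age 11)
n_i     <- 4           # observations per child (ages 8, 10, 12, 14)

# Subject means, taken from the per-subject intercepts at age 11 (lmF2).
ybar <- c(F10 = 18.500, F11 = 26.375)

# Weight given to the subject's own data, and the resulting BLUPs
shrink <- n_i * sigma_b^2 / (n_i * sigma_b^2 + sigma^2)
blup   <- shrink * (ybar - beta0)
round(c(shrink = shrink, blup), 4)
```

The results agree with the F10 and F11 rows printed above to about three decimals: each child's raw deviation from the population mean is pulled roughly 10% of the way towards zero.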
> coef(lmeOrth1ml) #subject specific coefficients (random intercept only)
    (Intercept) I(age - 11)
M16    23.10787   0.6601852
M05    23.10787   0.6601852
M02    23.44333   0.6601852
M11    23.66698   0.6601852
M07    23.77880   0.6601852
M08    23.89062   0.6601852
M03    24.22608   0.6601852
M12    24.22608   0.6601852
M13    24.22608   0.6601852
M14    24.78519   0.6601852
M09    25.00883   0.6601852
M15    25.67976   0.6601852
M06    26.12705   0.6601852
M04    26.35069   0.6601852
M01    27.35708   0.6601852
M10    28.92258   0.6601852
F10    19.08230   0.6601852
F09    21.43055   0.6601852
F06    21.43055   0.6601852
F01    21.65419   0.6601852
F05    22.77241   0.6601852
F07    23.10787   0.6601852
F02    23.10787   0.6601852
F08    23.44333   0.6601852
F03    23.77880   0.6601852
F04    24.78519   0.6601852
F11    26.12705   0.6601852
>plot(compareFits(coef(lmeOrth1),coef(lmeOrth1ml)))
[Figure: compareFits plot of coef(lmeOrth1) against coef(lmeOrth1ml), with panels (Intercept) and I(age - 11), one row per subject (M16 to F11).]
>plot(augPred(lmeOrth1),aspect="xy",grid=T)
[Figure: augPred plot for lmeOrth1: distance from pituitary to pterygomaxillary fissure (mm) against age (yr), one panel per subject (M01 to M16 and F01 to F11).]
> summary(lmeOrth2)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  454.6367 470.6173 -221.3183

Random effects:
 Formula: ~I(age - 11) | Subject
 Structure: General positive-definite
            StdDev    Corr
(Intercept) 2.1343327 (Intr)
I(age - 11) 0.2264275 0.503
Residual    1.3100394

Fixed effects: distance ~ I(age - 11)
                Value Std.Error DF  t-value p-value
(Intercept) 24.023148 0.4296608 80 55.91189       0
I(age - 11)  0.660185 0.0712532 80  9.26534       0
 Correlation:
            (Intr)
I(age - 11) 0.294

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max
-3.223106405 -0.493761198  0.007316808  0.472151143  3.916034231

Number of Observations: 108
Number of Groups: 27
Longitudinal Data Analysis 180
>plot(compareFits(ranef(lmeOrth2),ranef(lmeOrth2ml)),mark=c(0,0))
[Figure: dot plot comparing ranef(lmeOrth2) and ranef(lmeOrth2ml) for each subject (M16, ..., F11); panels: (Intercept), axis from −4 to 4, and I(age − 11), axis from −0.2 to 0.4.]
> summary(lmeOrth3)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC     BIC    logLik
  458.9891 498.655 -214.4945

Random effects:
 Formula: ~Sex + I(age - 11) + Sex:I(age - 11) | Subject
 Structure: General positive-definite
                      StdDev    Corr
(Intercept)           1.7178454 (Intr) SexFml I(-11)
SexFemale             1.6956351 -0.307
I(age - 11)           0.2937695 -0.009 -0.146
SexFemale:I(age - 11) 0.3160597  0.168  0.290 -0.964
Residual              1.2551778

Fixed effects: distance ~ Sex + I(age - 11) + Sex:I(age - 11)
                          Value Std.Error DF  t-value p-value
(Intercept)           24.968750 0.4572240 79 54.60945  0.0000
SexFemale             -2.321023 0.7823126 25 -2.96687  0.0065
I(age - 11)            0.784375 0.1015733 79  7.72226  0.0000
SexFemale:I(age - 11) -0.304830 0.1346293 79 -2.26421  0.0263
 Correlation:
                      (Intr) SexFml I(-11)
SexFemale             -0.584
I(age - 11)           -0.006  0.004
SexFemale:I(age - 11)  0.005  0.144 -0.754

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.96534486 -0.38609670  0.03647795  0.43142668  3.99155835

Number of Observations: 108
Number of Groups: 27
> OrthRE3=random.effects(lmeOrth3)
> OrthRE3
    (Intercept)   SexFemale  I(age - 11) SexFemale:I(age - 11)
M16 -1.73612668  0.63199885 -0.121203414          0.0748642681
M05 -1.73713471  0.49796730  0.035630448         -0.0876368146
M02 -1.40604191  0.43103963 -0.003830025         -0.0370896958
M11 -1.18396932  0.56512991 -0.239248823          0.2132764937
M07 -1.07528511  0.31943477  0.008987456         -0.0407096045
M08 -0.96357680  0.47583428 -0.213277852          0.1928075453
M03 -0.63399603  0.20785928 -0.017487532         -0.0003969599
M12 -0.63483606  0.09616632  0.113207353         -0.1358145288
M13 -0.63802816 -0.32826691  0.609847916         -0.6504012907
M14 -0.08183867  0.14099033 -0.135532941          0.1380152657
M09  0.13720981 -0.12701403  0.099549847         -0.0991217929
M15  0.79838740 -0.39490093  0.177462762         -0.1605286380
M06  1.24102052 -0.32776769 -0.058124041          0.0964521169
M04  1.46326110 -0.17133882 -0.319681817          0.3739018202
M01  2.45317943 -0.81889368  0.084716304         -0.0161270990
M10  3.99777519 -1.19823860 -0.021015641          0.1385089141
F10 -1.91258504 -1.84210386  0.071770763         -0.2293495874
F09 -0.72087067 -0.69430276  0.027050737         -0.0864435068
F06 -0.71120815 -0.68499782  0.026688309         -0.0852850411
F01 -0.59610113 -0.57413261  0.022368854         -0.0714818606
F05 -0.03022851 -0.02911148  0.001134008         -0.0036244236
F07  0.16900395  0.16277491 -0.006341852          0.0202661280
F02  0.19316023  0.18603726 -0.007247922          0.0231622923
F08  0.30543005  0.29417922 -0.011461928          0.0366266523
F03  0.54331257  0.52328536 -0.020387500          0.0651510668
F04  1.02505976  0.98728531 -0.038465941          0.1229211327
F11  1.73502694  1.67108646 -0.065107527          0.2080571474
>plot(augPred(lmeOrth3),aspect="xy",grid=T)
[Figure: observed values and predictions from lmeOrth3, one panel per subject (M16, ..., F11); x-axis: Age (yr); y-axis: Distance from pituitary to pterygomaxillary fissure (mm).]
> newOrth=data.frame(Subject=rep(c("M11","F03"),c(3,3)),
+                    Sex=rep(c("Male","Female"),c(3,3)),age=rep(16:18,2))
> newOrth
  Subject    Sex age
1     M11   Male  16
2     M11   Male  17
3     M11   Male  18
4     F03 Female  16
5     F03 Female  17
6     F03 Female  18
> predict(lmeOrth3,newdata=newOrth,level=0:1)
  Subject predict.fixed predict.Subject
1     M11      28.89063        26.51041
2     M11      29.67500        27.05554
3     M11      30.45938        27.60066
4     F03      25.04545        26.33587
5     F03      25.52500        26.86018
6     F03      26.00455        27.38449
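The two prediction levels returned by predict(lmeOrth3, ..., level=0:1) can be reproduced by hand from the printed estimates. The sketch below is only an illustration in base R: the fixed effects are copied from summary(lmeOrth3), the random effects of M11 and F03 from random.effects(lmeOrth3), and the helper names design_row and predict_by_hand are hypothetical. It assumes that the random-effects design equals the fixed-effects design, as in the formula shown for lmeOrth3.

```r
# Fixed-effect estimates copied from summary(lmeOrth3)
beta <- c(intercept = 24.968750, female = -2.321023,
          slope = 0.784375, female_slope = -0.304830)
# Estimated random effects copied from random.effects(lmeOrth3)
b_M11 <- c(-1.18396932, 0.56512991, -0.239248823, 0.2132764937)
b_F03 <- c( 0.54331257, 0.52328536, -0.020387500, 0.0651510668)

# Design row (1, female, age - 11, female * (age - 11)) for one observation
design_row <- function(age, female) c(1, female, age - 11, female * (age - 11))

# Level 0 uses the fixed effects only; level 1 adds the subject's random effects
predict_by_hand <- function(age, female, b) {
  x <- design_row(age, female)
  c(fixed = sum(x * beta), subject = sum(x * (beta + b)))
}

pred_M11 <- t(sapply(16:18, predict_by_hand, female = 0, b = b_M11))
pred_F03 <- t(sapply(16:18, predict_by_hand, female = 1, b = b_F03))
print(round(rbind(pred_M11, pred_F03), 5))
```

Up to rounding of the copied estimates, the two columns should agree with predict.fixed and predict.Subject in the console output.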
>lmListOrth=lmList(distance ~ I(age-11), data=Orthodont)
>compFOrth=compareFits(coef(lmListOrth),coef(lmeOrth2))
[Figure: dot plot comparing coef(lmListOrth) and coef(lmeOrth2) for each subject (M16, ..., F11); panels: (Intercept), axis from 18 to 30, and I(age − 11), axis from 0.5 to 2.0.]
>plot(comparePred(lmListOrth,lmeOrth2,length.out=2),layout=c(9,3))
[Figure: predictions from lmListOrth and lmeOrth2 overlaid on the observed values, one panel per subject (M16, ..., F11); x-axis: Age (yr); y-axis: Distance from pituitary to pterygomaxillary fissure (mm).]
Examining a Fitted Model
There are two basic assumptions that need to be assessed
1. the within-group errors are assumed independent and identically normally distributed with mean zero and variance $\sigma^2$ (since $W_i = \sigma^2 I$), and they are independent of the random effects
2. the random effects are normally distributed with mean zero and covariance matrix $B$ (not depending on the group) and are independent for different groups.
Assessing assumptions on the within-group error
• The primary quantities used to assess the adequacy of the first assumption are the within-group residuals, defined as the difference between the observed and the within-group fitted value.
• The plot method of the lme class is the primary tool for obtaining diagnostics for the first assumption.
Example: Orthodont (cont.)
• Initially we consider the box plot of the residuals, by group.
• We add a vertical line at zero so we can assess whether the residuals
– are centered at zero
– have constant variance across groups
– are independent of the group level
>plot(lmeOrth2,Subject ~ resid(.),abline=0)
[Figure: box plots of the within-group residuals of lmeOrth2 by Subject (M16, ..., F11); x-axis: Residuals (mm), from −4 to 4, with a vertical reference line at zero.]
>plot(lmeOrth2,resid(.,type='p') ~ fitted(.)|Sex,id=0.05,adj=-0.3)
[Figure: standardized residuals versus fitted values (mm) for lmeOrth2; panels: Male, Female; labelled observations: M09 (twice) and M13.]
> lmeOrth5=lme(distance~I(age-11),data=Orthodont,weights=varIdent(form=~1|Sex))
> summary(lmeOrth5)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC      BIC    logLik
  435.6466 454.2907 -210.8233

Random effects:
 Formula: ~I(age - 11) | Subject
 Structure: General positive-definite
            StdDev    Corr
(Intercept) 2.1590091 (Intr)
I(age - 11) 0.1980627 0.617
Residual    1.6452598

Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | Sex
 Parameter estimates:
     Male    Female
1.0000000 0.4040981
Fixed effects: distance ~ I(age - 11)
               Value Std.Error DF  t-value p-value
(Intercept) 23.97377 0.4341697 80 55.21752       0
I(age - 11)  0.60686 0.0594260 80 10.21203       0
 Correlation:
            (Intr)
I(age - 11) 0.391

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-3.02779067 -0.48052007  0.04214476  0.51813201  3.18632228

Number of Observations: 108
Number of Groups: 27
> anova(lmeOrth2,lmeOrth5)
         Model df      AIC      BIC    logLik   Test  L.Ratio p-value
lmeOrth2     1  6 454.6367 470.6173 -221.3183
lmeOrth5     2  7 435.6466 454.2907 -210.8233 1 vs 2 20.99004  <.0001
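The likelihood ratio test reported by anova(lmeOrth2, lmeOrth5) can be checked by hand. The following base R sketch uses only the log-likelihoods and parameter counts printed in that output; the variable names are illustrative. It assumes the usual chi-square reference distribution with degrees of freedom equal to the difference in the number of parameters.

```r
# Log-likelihoods and parameter counts copied from anova(lmeOrth2, lmeOrth5)
logLik2 <- -221.3183; df2 <- 6
logLik5 <- -210.8233; df5 <- 7

# Likelihood ratio statistic and its chi-square p-value
lr_stat <- 2 * (logLik5 - logLik2)
p_value <- pchisq(lr_stat, df = df5 - df2, lower.tail = FALSE)

# AIC of the larger model: -2 logLik + 2 * (number of parameters)
aic5 <- -2 * logLik5 + 2 * df5
print(c(L.Ratio = lr_stat, p.value = p_value, AIC = aic5))
```

Both models are fitted by REML with the same fixed effects, which is what makes this comparison of the variance structures meaningful.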
>plot(lmeOrth5,resid(.,type='p') ~ fitted(.)|Sex,id=0.05,adj=-0.3)
[Figure: standardized residuals versus fitted values (mm) for lmeOrth5; panels: Male, Female; labelled observations: M09 (twice) and M13.]
>plot(lmeOrth5,distance ~ fitted(.),id=0.05,adj=-0.3)
[Figure: observed distance from pituitary to pterygomaxillary fissure (mm) versus fitted values (mm) for lmeOrth5; labelled observations: M09 (twice) and M13.]
>qqnorm(lmeOrth5, ~ resid(.)|Sex)
[Figure: normal Q-Q plots of the residuals of lmeOrth5; x-axis: Residuals (mm); y-axis: Quantiles of standard normal; panels: Male, Female.]
Assessing assumptions on the random effects
• The ranef method is used to obtain the estimated BLUPs of the random effects for lme objects.
• Two types of diagnostic plots will be used to assess the second assumption
– qqnorm: normal plot
– pairs: scatter plot
>qqnorm(lmeOrth2, ~ ranef(.),id=0.10,cex=0.7)
[Figure: normal Q-Q plots of the estimated random effects of lmeOrth2; x-axis: Random effects; y-axis: Quantiles of standard normal; panels: (Intercept), with M10 and F10 labelled, and I(age − 11), with M13 labelled.]
>pairs(lmeOrth2,~ranef(.)|Sex,id= ~ Subject=='M13',adj=-0.3)
[Figure: scatter plot of the estimated random effects of lmeOrth2, I(age − 11) against (Intercept); panels: Male, with M13 labelled, and Female.]
>qqnorm(lmeOrth5, ~ ranef(.),id=0.10,cex=0.7)
[Figure: normal Q-Q plots of the estimated random effects of lmeOrth5; x-axis: Random effects; y-axis: Quantiles of standard normal; panels: (Intercept), with M10 and F10 labelled, and I(age − 11).]
Revision: Generalized Linear Models
• So far we have discussed methods for analyzing continuous data
• When the response is discrete (e.g. binary, count), linear models are no longer appropriate
• Instead, we use Generalized Linear Models (GLM)
• Extensions of GLMs will be considered for the analysis of longitudinal data
Features:
1. We have a response variable $Y_i$ for the $i$th subject, $i = 1, \ldots, N$, with an associated $p \times 1$ vector of covariates
$$X_i = (X_{i1}, \ldots, X_{ip})'.$$
2. Distributional assumption:
In linear models, the distribution of the response variable is assumed normal. The GLM extends this by assuming that the distribution of the response variable belongs to the exponential family of distributions
$$f(y_i; \theta_i, \phi) = \exp\left[\{y_i\theta_i - a(\theta_i)\}/\phi + b(y_i, \phi)\right].$$
The specific functions $a(\cdot)$ and $b(\cdot)$ distinguish one member of the family from another. The parameter $\theta_i$ is called the location parameter and $\phi$ is the dispersion parameter.
It can be shown that
$$\mathrm{Var}(Y_i) = \phi\, v(\mu_i),$$
where $v(\mu_i)$ is the variance function, a known function of the mean $\mu_i$, and $\phi > 0$. Members of this family include the Normal, Bernoulli and Poisson distributions.
3. Systematic Component:
In the GLM, the mean is a function of the linear predictor $\eta_i$,
$$\eta_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip},$$
where usually $X_{i1} = 1$.
Note: In this context, linear means that $\eta_i$ is linear in the regression parameters $\beta$ but not necessarily in the covariates.
4. Link Function:
The final step is to relate the mean $\mu_i$ to the linear predictor $\eta_i$. This is done by introducing the link function $g(\cdot)$,
$$g(\mu_i) = \eta_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip}.$$
The link function is a known function, e.g. $\log(\mu_i)$, that transforms the mean so that it changes linearly with changes in the covariates.

Distribution   $v(\mu)$        Link Function
Normal         $1$             Identity: $\mu = \eta$
Bernoulli      $\mu(1-\mu)$    Logit: $\log\{\mu/(1-\mu)\} = \eta$
Poisson        $\mu$           Log: $\log(\mu) = \eta$
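The link and variance functions in the table are available in base R through the family objects gaussian(), binomial() and poisson(). The sketch below is only an illustration: the value mu = 0.3 is arbitrary and the name glm_table is hypothetical. It evaluates $g(\mu)$, $v(\mu)$ and the inverse link for each family.

```r
# Family objects from base R (package stats); mu = 0.3 is an arbitrary test value
families <- list(Normal = gaussian(), Bernoulli = binomial(), Poisson = poisson())
mu <- 0.3

glm_table <- data.frame(
  distribution = names(families),
  link         = sapply(families, function(f) f$link),
  eta          = sapply(families, function(f) f$linkfun(mu)),            # g(mu)
  variance     = sapply(families, function(f) f$variance(mu)),           # v(mu)
  back         = sapply(families, function(f) f$linkinv(f$linkfun(mu)))  # g^{-1}(g(mu))
)
print(glm_table, row.names = FALSE)
```

The last column should return the starting value of mu, since the inverse link undoes the link.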
Logistic Regression: Binary outcomes
• Response $Y_i$ is a binary outcome with $P(Y_i = 1) = \mu_i$
• The mean is related to the covariates through
$$\mathrm{logit}(\mu_i) = \log\left(\frac{\mu_i}{1-\mu_i}\right) = \beta_1 + \beta_2 X_i$$
• Responses are Bernoulli variables with
$$\mathrm{Var}(Y_i) = \mu_i(1-\mu_i)$$
• It can also be expressed as
$$\mu_i = \frac{\exp(\beta_1 + \beta_2 X_i)}{1 + \exp(\beta_1 + \beta_2 X_i)}$$
Log-Linear Model for Count data
• Response $Y_i$ is a count that is assumed to follow a Poisson distribution
$$P(Y_i = y_i) = \frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!}.$$
• The mean is related to the covariates through
$$\log(\mu_i) = \beta_1 + \beta_2 X_i.$$
• If the rate of occurrence is of interest, we get
$$\log(\mu_i/T_i) = \beta_1 + \beta_2 X_i \;\Rightarrow\; \log(\mu_i) = \log(T_i) + \beta_1 + \beta_2 X_i,$$
where $T_i$ is the relevant time period. The term $\log(T_i)$ is known as an offset and enters the model with a fixed coefficient equal to 1.
• Responses are Poisson variables with
$$\mathrm{Var}(Y_i) = v(\mu_i) = \mu_i.$$
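A small simulation can illustrate how an offset enters a Poisson log-linear model in R. Everything in the sketch below is hypothetical (seed, sample size, covariate, observation times T_obs and true coefficients); it only shows that passing log(T_obs) through offset() recovers the coefficients of the rate model.

```r
set.seed(1)                                 # arbitrary seed, for reproducibility only
n     <- 2000                               # hypothetical sample size
x     <- rbinom(n, 1, 0.5)                  # hypothetical binary covariate
T_obs <- sample(1:10, n, replace = TRUE)    # hypothetical observation times T_i
beta  <- c(0.5, -0.3)                       # hypothetical true (beta_1, beta_2)

# Counts with mean mu_i = T_i * exp(beta_1 + beta_2 * x_i)
mu <- T_obs * exp(beta[1] + beta[2] * x)
y  <- rpois(n, mu)

# log(T_i) enters the linear predictor with its coefficient fixed at 1
fit <- glm(y ~ x + offset(log(T_obs)), family = poisson)
print(coef(fit))
```

With this many observations the estimates should land close to the hypothetical true values, whereas omitting the offset would push the effect of the varying observation time into the intercept and the error.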
Classes of model for dependent non-normal data
In the current settings with longitudinal data, two classes of models are widely used
1. marginal or population average (PA) models
2. subject-specific (SS) models
i. Marginal or Population Average Models (PA)
• Consider the logistic model
$$\mathrm{logit}(E[Y_{ij}]) = X_{ij}'\beta_1$$
$$E[Y_{ij}] = \frac{\exp(X_{ij}'\beta_1)}{1 + \exp(X_{ij}'\beta_1)}$$
• Also called population-average model
• Models the mean at each time
• Changes represent changes at the average level, not within-subject change
• Does not induce any within-subject dependence
ii. Subject-Specific Models (SS)
• Consider the logistic model
$$\mathrm{logit}(E[Y_{ij} \mid b_i]) = X_{ij}'\beta_2 + b_i$$
$$E[Y_{ij} \mid b_i] = \frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}$$
• $b_i$ is the effect associated with subject $i$
• repeated measurements are assumed independent conditional on $b_i$
• Taking the expectation with respect to $b_i$ induces correlation among repeated measures and defines the marginal expectation
$$E[Y_{ij}] = E_S\{E[Y_{ij} \mid b_i]\} = E_S\left\{\frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}\right\}$$
To summarize:
• with dependent data, when we move away from the normal linear model, we no longer have a unified modeling framework
• different approaches are defined for different distributions
• as a result, parameters represent different things in different models
• extra care is needed about the scale
Comparison between PA and SS models
• This is a matter of scale
– e.g. in logistic regression for the SS model, the linear predictor
$$\mathrm{logit}(E[Y_{ij} \mid b_i]) = X_{ij}'\beta_2 + b_i$$
operates on the logit scale
– but marginalizing involves averaging on the probability scale
$$\begin{aligned}
E[Y_{ij}] &= E_S\{E[Y_{ij} \mid b_i]\} \\
&= E_S\left\{\frac{\exp(X_{ij}'\beta_2 + b_i)}{1 + \exp(X_{ij}'\beta_2 + b_i)}\right\} \\
&\neq \frac{\exp(X_{ij}'\beta_2 + E[b_i])}{1 + \exp(X_{ij}'\beta_2 + E[b_i])} = \frac{\exp(X_{ij}'\beta_2)}{1 + \exp(X_{ij}'\beta_2)}
\end{aligned}$$
• The final expression is the probability for a subject with zero subject effect and is not the same thing as the average probability over the subjects
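The gap between the two scales can be checked numerically. The sketch below is an illustration under stated assumptions: the subject effect is taken to be normal with mean zero, and the values of the linear predictor (eta = 1) and of the standard deviation (sigma_b = 2) are hypothetical. It compares the probability for a subject with $b_i = 0$ with the probability averaged over subjects.

```r
expit <- function(x) 1 / (1 + exp(-x))   # inverse logit; base R also offers plogis()

eta     <- 1   # hypothetical value of the linear predictor X'beta
sigma_b <- 2   # hypothetical standard deviation of the subject effect b_i

# Probability for a subject with zero subject effect
conditional <- expit(eta)

# Marginal probability: average of expit(eta + b) over b ~ N(0, sigma_b^2)
marginal <- integrate(function(b) expit(eta + b) * dnorm(b, sd = sigma_b),
                      lower = -Inf, upper = Inf)$value

print(c(conditional = conditional, marginal = marginal))
```

For a positive linear predictor the averaged probability is pulled towards 0.5, which is why coefficients of the SS and PA logistic models are not directly comparable.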
Marginal Models: Generalized Estimating Equations (GEE)
• marginal models are primarily used to provide inferences about the population means
• GEEs provide an extension to the GLMs to longitudinal data
• no distributional assumption for the response variable is required
• only the specification of a regression model for the mean is required
• the response variable can be continuous, binary or count
• furthermore, as a regression model, it easily handles unbalanced data
Notation:
• the notation is similar to what we have already introduced
• the response variable
$$Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$$
doesn't have to be continuous any more
• $n_i$ is the number of observations for subject $i$
• associated with each $Y_{ij}$ is a $p \times 1$ vector of covariates
$$X_{ij} = (X_{ij1}, X_{ij2}, \ldots, X_{ijp})',$$
where $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, n_i$.
• Two types of covariates are included among $X_{ij}$
1. between-subject covariates, which do not change over time (gender, treatment, etc.)
2. within-subject covariates, which change over time (time since baseline, current status, etc.)
Note: Since marginal models are primarily concerned with population means, marginal models for longitudinal data model the mean response and the within-subject association between the repeated responses separately.
• the former is of interest
• the latter is treated as a nuisance
A marginal model has the following three-part specification
1. The mean structure is
$$g(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta,$$
where the conditional mean $\mu_{ij} = E[Y_{ij} \mid X_{ij}]$ depends on the linear predictor $\eta_{ij}$ through the link function $g(\cdot)$.
2. The variance is assumed to have the form
$$\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij}),$$
where $v(\mu_{ij})$ is a known function of the mean and $\phi$ is a scale parameter that may be known or may need to be estimated. The scale parameter could be different for different occasions (balanced data) or could depend on time.
3. The within-subject association among the repeated responses, given $X_{ij}$, is a function of a separate set of parameters, say $\alpha$, and could also depend on the means. The association could be expressed through pairwise correlations or log-odds ratios, depending on the type of data.
Furthermore:
1. in marginal models, the mean response and the within-subject association are modeled separately
2. the avoidance of distributional assumptions for $Y_{ij}$ leads to a method of estimation known as Generalized Estimating Equations (GEE)
e.g. Marginal Model for Continuous Response
• The mean of $Y_{ij}$ is related to the covariates through the identity link
$$\mu_{ij} = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij}) = \phi,$$
where $v(\mu_{ij}) = 1$ and $\phi$ has to be estimated.
• The within-subject association among repeated measures can be modeled using any of the covariance structures already discussed (autoregressive, unstructured, etc.). For example, we can assume a first-order autoregressive correlation structure
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = a^{|k-j|},$$
where $0 \le a \le 1$.
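As a small illustration, the first-order autoregressive correlation matrix $\mathrm{Corr}(Y_{ij}, Y_{ik}) = a^{|k-j|}$ can be built in base R with outer(). The helper name ar1_corr and the values n = 4 and a = 0.6 are hypothetical choices for this sketch.

```r
# AR(1) correlation matrix for n equally spaced occasions: entry (j, k) is a^|k - j|
ar1_corr <- function(n, a) {
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))  # matrix of |k - j|
  a^lag
}

R <- ar1_corr(n = 4, a = 0.6)   # hypothetical: 4 occasions, a = 0.6
print(round(R, 3))
```

The correlation decays geometrically with the lag, so a single parameter describes the whole matrix.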
• The already discussed linear model can be seen as a special case of the marginal model
• The marginal model provides a broad class of models for continuous data, largely based on the choice of the link function
e.g. Marginal Model for Binary Response
• Responses are considered Bernoulli variables
• The mean of $Y_{ij}$ is related to the covariates through the logit link
$$\log\left(\frac{\mu_{ij}}{1-\mu_{ij}}\right) = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \mu_{ij}(1-\mu_{ij}),$$
where $\phi = 1$.
• The within-subject association among repeated measures can be modeled using an unstructured pairwise log-odds ratio pattern (or any other available pattern)
$$\log \mathrm{OR}(Y_{ij}, Y_{ik}) = a_{jk},$$
where
$$\mathrm{OR}(Y_{ij}, Y_{ik}) = \frac{P(Y_{ij} = 1, Y_{ik} = 1)\,P(Y_{ij} = 0, Y_{ik} = 0)}{P(Y_{ij} = 1, Y_{ik} = 0)\,P(Y_{ij} = 0, Y_{ik} = 1)}.$$
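The pairwise odds ratio is a function of the joint distribution of two binary responses. The sketch below computes it in base R from a hypothetical 2 x 2 table of joint probabilities; the numbers are made up for illustration only.

```r
# Hypothetical joint probabilities of (Y_ij, Y_ik); rows: Y_ij = 0, 1; columns: Y_ik = 0, 1
joint <- matrix(c(0.40, 0.10,
                  0.15, 0.35),
                nrow = 2, byrow = TRUE,
                dimnames = list(Yij = c("0", "1"), Yik = c("0", "1")))
stopifnot(isTRUE(all.equal(sum(joint), 1)))   # the four cells must sum to one

# OR = P(1,1) P(0,0) / { P(1,0) P(0,1) }
odds_ratio <- (joint["1", "1"] * joint["0", "0"]) /
              (joint["1", "0"] * joint["0", "1"])
log_or <- log(odds_ratio)
print(c(OR = odds_ratio, logOR = log_or))
```

A log-odds ratio of zero corresponds to independence of the two responses; positive values indicate that the two responses tend to agree.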
e.g. Marginal Model for Counts
• Responses are considered to follow a Poisson distribution
• The mean of $Y_{ij}$ is related to the covariates through the log link function
$$\log(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta.$$
• The variance has the form
$$\mathrm{Var}(Y_{ij}) = \phi\,\mu_{ij},$$
where $\phi$ does not depend on time and has to be estimated.
• The within-subject association among repeated measures can be modeled using an unstructured pairwise correlation pattern (or any other available pattern)
$$\mathrm{Corr}(Y_{ij}, Y_{ik}) = a_{jk}.$$
Here a balanced design has been assumed.
• In the model specification, the Poisson variance is multiplied by a parameter $\phi$. Hence, the variance is inflated when $\phi > 1$. Count data very commonly show more variability than the Poisson variance predicts; this is called overdispersion.
Estimation
• GEE approach is based on estimating equations
• The idea is to extend the usual likelihood equation for GLM by incorporating the covariancematrix of the responses
• Assume the following marginal model
1. $g(\mu_{ij}) = \eta_{ij} = X_{ij}'\beta$.
2. $\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij})$, where $v(\mu_{ij})$ is a known function of the mean and $\phi$ can be different for each occasion (balanced data) or depend on time.
3. The pairwise within-subject association is assumed to be a function of the means $\mu_{ij}$ and a set of association parameters $\alpha$, such that
$$V_i = A_i^{1/2}\,\mathrm{Corr}(Y_i)\,A_i^{1/2},$$
where $A_i$ is a diagonal matrix with elements $\mathrm{Var}(Y_{ij}) = \phi\, v(\mu_{ij})$ along the diagonal (so that $A_i^{1/2}$ holds the standard deviations) and $\mathrm{Corr}(Y_i)$ is a correlation matrix, a function of $\alpha$. We tend to call $V_i$ a working covariance matrix, to distinguish it from the true underlying covariance matrix.
• The GLS estimator of $\beta$ is
$$\hat\beta = \left\{\sum_{i=1}^{N} X_i'\Sigma_i^{-1}X_i\right\}^{-1}\sum_{i=1}^{N} X_i'\Sigma_i^{-1}y_i,$$
obtained by solving
$$\sum_{i=1}^{N} X_i'\Sigma_i^{-1}(y_i - \mu_i) = 0,$$
with $\mu_i = X_i\beta$, as part of the minimization of
$$\sum_{i=1}^{N} (y_i - X_i\beta)'\Sigma_i^{-1}(y_i - X_i\beta).$$
• The GEE estimator of $\beta$ is obtained from the minimization of
$$\sum_{i=1}^{N} (y_i - \mu_i(\beta))'V_i^{-1}(y_i - \mu_i(\beta))$$
with respect to $\beta$, where $V_i$ is assumed known (ignoring its dependence on $\beta$) and $\mu_i$ is the vector of $\mu_{ij} = g^{-1}(X_{ij}'\beta)$. This results in the generalized estimating equations
$$\sum_{i=1}^{N} D_i'V_i^{-1}(y_i - \mu_i) = 0,$$
where $V_i$ is the working covariance matrix and $D_i = \partial\mu_i/\partial\beta$.
Iterative estimation procedure
The GEE have no closed-form solution.
Step 1: Given current (initial) estimates of $\alpha$ and $\phi$, $V_i$ is estimated and an estimate of $\beta$ is obtained from
$$\sum_{i=1}^{N} D_i'V_i^{-1}(y_i - \mu_i) = 0.$$
Step 2: Given the current estimate of $\beta$, estimates of $\alpha$ and $\phi$ can be obtained from the standardized residuals
$$e_{ij} = \frac{Y_{ij} - \hat\mu_{ij}}{\sqrt{v(\hat\mu_{ij})}}.$$
Notes:
1. We iterate between the above steps until convergence.
2. Initial values for $\beta$ can be obtained by fitting a GLM assuming independent observations
3. The algorithm is simple
Properties:
1. The estimate $\hat\beta$ is a consistent estimate of $\beta$ (large-sample property). This is true irrespective of the choice of $V_i$. Hence, all we need is that the model for the mean is correctly specified.
2. In large samples, $\hat\beta$ has a multivariate normal (MVN) distribution with mean $\beta$ and
$$\mathrm{Cov}(\hat\beta) = B^{-1}MB^{-1},$$
where
$$B = \sum_{i=1}^{N} D_i'V_i^{-1}D_i, \qquad M = \sum_{i=1}^{N} D_i'V_i^{-1}\,\mathrm{Cov}(Y_i)\,V_i^{-1}D_i.$$
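These matrices can be estimated by substituting $\beta$, $\alpha$ and $\phi$ by their estimates and replacing $\mathrm{Cov}(Y_i) = \Sigma_i$ by
$$(Y_i - \hat\mu_i)(Y_i - \hat\mu_i)'.$$
3. Hence
$$\widehat{\mathrm{Cov}}(\hat\beta) = \left(\sum_{i=1}^{N} D_i'V_i^{-1}D_i\right)^{-1}\left\{\sum_{i=1}^{N} D_i'V_i^{-1}(Y_i - \hat\mu_i)(Y_i - \hat\mu_i)'V_i^{-1}D_i\right\}\left(\sum_{i=1}^{N} D_i'V_i^{-1}D_i\right)^{-1}.$$
This is the so-called sandwich estimator.
4. Finally, if the covariance is modeled correctly, $V_i = \Sigma_i$ and
$$\mathrm{Cov}(\hat\beta) = B^{-1}.$$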
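The model-based and sandwich covariance formulas can be traced in a few lines of base R for the simplest case, a logistic marginal model with an independence working correlation and $\phi = 1$. The sketch below is an illustration only: the data are simulated (all simulation settings are hypothetical), and it relies on the fact that with this working correlation the estimating equations coincide with the GLM score equations, so glm() is used to solve for $\hat\beta$. In this special case $D_i'V_i^{-1} = X_i'$, which simplifies $B$ and $M$.

```r
set.seed(42)                                   # arbitrary seed
N <- 200; n_i <- 4                             # hypothetical: 200 subjects, 4 visits each
id <- rep(seq_len(N), each = n_i)
x  <- rep(rbinom(N, 1, 0.5), each = n_i)       # subject-level binary covariate
b  <- rep(rnorm(N, sd = 1.5), each = n_i)      # subject effect, induces correlation
y  <- rbinom(N * n_i, 1, plogis(-0.5 + x + b))

# Independence working correlation: beta-hat solves the ordinary GLM score equations
fit <- glm(y ~ x, family = binomial)
X   <- model.matrix(fit)
mu  <- fitted(fit)
r   <- y - mu

# B = sum_i D_i' V_i^{-1} D_i = sum_i X_i' A_i X_i, with A_i = diag{mu (1 - mu)}
B <- crossprod(X, X * (mu * (1 - mu)))
# Row i of U holds X_i'(y_i - mu_i); M = sum_i U_i U_i'
U <- rowsum(X * r, group = id)
M <- crossprod(U)

B_inv     <- solve(B)
naive_se  <- sqrt(diag(B_inv))                    # model-based: B^{-1}
robust_se <- sqrt(diag(B_inv %*% M %*% B_inv))    # sandwich: B^{-1} M B^{-1}
print(cbind(estimate = coef(fit), naive_se, robust_se))
```

Because the simulated responses are positively correlated within subject and the covariate is constant within subject, the robust standard error for x is expected to exceed the naive one. The naive column should match the standard errors reported by glm().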
Pros:
1. The GEE estimator $\hat\beta$ is, in many practical settings, nearly as precise as the MLE.
2. The GEE estimator is a consistent estimate of $\beta$ even when the within-subject associations are misspecified.
3. In this case, valid* estimates of the standard errors can be obtained from the sandwich estimator
* Reliance on the sandwich estimator is not appealing when the number of subjects is not very big compared to the number of repeated observations, or when the design is unbalanced. In these cases, it is preferable to use the model-based covariance
$$\mathrm{Cov}(\hat\beta) = B^{-1},$$
which provides valid estimates when the working covariance matrix is a good approximation of the true covariance $\Sigma_i$.
Example: Respiratory Data (Binary)
• In each of two centers patients were randomized to active treatment or placebo
• During treatment, the respiratory status (poor or good) was determined at each of fourmonthly visits
• There were 111 patients (54 vs 57)
• The question of interest is to assess whether the treatment is effective and to estimate its effect
resp_glm = glm(status ~ centre + treatment + sex + baseline + age,
               data = resp, family = "binomial")
> summary(resp_glm)

Call:
glm(formula = status ~ centre + treatment + sex + baseline +
    age, family = "binomial", data = resp)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.3146  -0.8551   0.4336   0.8953   1.9246

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.900171   0.337653  -2.666  0.00768 **
centre2             0.671601   0.239567   2.803  0.00506 **
treatmenttreatment  1.299216   0.236841   5.486 4.12e-08 ***
sexmale             0.119244   0.294671   0.405  0.68572
baselinegood        1.882029   0.241290   7.800 6.20e-15 ***
age                -0.018166   0.008864  -2.049  0.04043 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 608.93  on 443  degrees of freedom
Residual deviance: 483.22  on 438  degrees of freedom
AIC: 495.22

Number of Fisher Scoring iterations: 4
library("gee")  # provides gee()
resp_gee1 = gee(nstat ~ centre + treatment + sex + baseline + age, data = resp,
                family = "binomial", id = subject, corstr = "independence",
                scale.fix = TRUE, scale.value = 1)
> summary(resp_gee1)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logit
 Variance to Mean Relation: Binomial
 Correlation Structure:     Independent

Call:
gee(formula = nstat ~ centre + treatment + sex + baseline + age,
    id = subject, data = resp, family = "binomial", corstr = "independence",
    scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
        Min          1Q      Median          3Q         Max
-0.93134415 -0.30623174  0.08973552  0.33018952  0.84307712

Coefficients:
                      Estimate  Naive S.E.   Naive z Robust S.E.   Robust z
(Intercept)        -0.90017133 0.337653052 -2.665965  0.46032700 -1.9555041
centre2             0.67160098 0.239566599  2.803400  0.35681913  1.8821889
treatmenttreatment  1.29921589 0.236841017  5.485603  0.35077797  3.7038127
sexmale             0.11924365 0.294671045  0.404667  0.44320235  0.2690501
baselinegood        1.88202860 0.241290221  7.799854  0.35005152  5.3764332
age                -0.01816588 0.008864403 -2.049306  0.01300426 -1.3969169

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
resp_gee2 = gee(nstat ~ centre + treatment + sex + baseline + age, data = resp,
                family = "binomial", id = subject, corstr = "exchangeable",
                scale.fix = TRUE, scale.value = 1)
> summary(resp_gee2)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logit
 Variance to Mean Relation: Binomial
 Correlation Structure:     Exchangeable

Call:
gee(formula = nstat ~ centre + treatment + sex + baseline + age,
    id = subject, data = resp, family = "binomial", corstr = "exchangeable",
    scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
        Min          1Q      Median          3Q         Max
-0.93134415 -0.30623174  0.08973552  0.33018952  0.84307712

Coefficients:
                      Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.90017133 0.47846344 -1.8813796  0.46032700 -1.9555041
centre2             0.67160098 0.33947230  1.9783676  0.35681913  1.8821889
treatmenttreatment  1.29921589 0.33561008  3.8712064  0.35077797  3.7038127
sexmale             0.11924365 0.41755678  0.2855747  0.44320235  0.2690501
baselinegood        1.88202860 0.34191472  5.5043802  0.35005152  5.3764332
age                -0.01816588 0.01256110 -1.4462014  0.01300426 -1.3969169

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3359883 0.3359883 0.3359883
[2,] 0.3359883 1.0000000 0.3359883 0.3359883
[3,] 0.3359883 0.3359883 1.0000000 0.3359883
[4,] 0.3359883 0.3359883 0.3359883 1.0000000
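The naive standard errors of resp_gee1 (independence) and resp_gee2 (exchangeable) appear to differ by a constant factor. The base R sketch below checks this using only numbers copied from the two summaries. The interpretation is an assumption on our part: with four visits per patient and covariates that do not change within patient, the model-based variance under an exchangeable working correlation $\rho$ is inflated by $1 + (n_i - 1)\rho$.

```r
rho <- 0.3359883   # exchangeable working correlation from summary(resp_gee2)
n_i <- 4           # visits per patient

# Naive S.E. columns copied from summary(resp_gee1) and summary(resp_gee2)
naive_indep <- c(0.337653052, 0.239566599, 0.236841017,
                 0.294671045, 0.241290221, 0.008864403)
naive_exch  <- c(0.47846344, 0.33947230, 0.33561008,
                 0.41755678, 0.34191472, 0.01256110)

# Standard errors scale with the square root of the variance inflation
inflation <- sqrt(1 + (n_i - 1) * rho)
print(round(cbind(rescaled_indep = naive_indep * inflation, naive_exch), 6))
```

The rescaled independence standard errors should match the exchangeable ones to the printed precision, and both are then close to the robust standard errors.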
> # Confidence Interval for estimated treatment effect [logOR scale]
> se <- summary(resp_gee2)$coefficients["treatmenttreatment","Robust S.E."]
> coef(resp_gee2)["treatmenttreatment"] + c(-1, 1) * se * qnorm(0.975)
[1] 0.6117037 1.9867281
>
> # Confidence Interval for estimated treatment effect [OR scale]
> exp(coef(resp_gee2)["treatmenttreatment"] + c(-1, 1) * se * qnorm(0.975))
[1] 1.843570 7.291637
Example: Epilepsy Data (Counts)
• 59 patients with epilepsy were randomized to receive either "Progabide" or "Placebo".
• The number of seizures observed in each of four 2-week periods was recorded, along with the baseline seizure count for the 8 weeks prior to randomization
• The question of interest is whether taking the anti-epileptic drug reduces the number of seizures compared to placebo
> data("epilepsy", package = "HSAUR")
> itp <- interaction(epilepsy$treatment, epilepsy$period)
> tapply(epilepsy$seizure.rate, itp, mean)
  placebo.1 Progabide.1   placebo.2 Progabide.2   placebo.3 Progabide.3   placebo.4 Progabide.4
   9.357143    8.580645    8.285714    8.419355    8.785714    8.129032    7.964286    6.709677
> tapply(epilepsy$seizure.rate, itp, var)
  placebo.1 Progabide.1   placebo.2 Progabide.2   placebo.3 Progabide.3   placebo.4 Progabide.4
  102.75661   332.71828    66.65608   140.65161   215.28571   193.04946    58.18386   126.87957
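For a Poisson variable the variance equals the mean, so the ratio of the sample variance to the sample mean gives a rough first look at overdispersion. The base R sketch below uses only the group means and variances printed by the two tapply() calls; the variable names are illustrative.

```r
# Group means and variances copied from the tapply() output
groups <- c("placebo.1", "Progabide.1", "placebo.2", "Progabide.2",
            "placebo.3", "Progabide.3", "placebo.4", "Progabide.4")
means <- c(9.357143, 8.580645, 8.285714, 8.419355,
           8.785714, 8.129032, 7.964286, 6.709677)
vars  <- c(102.75661, 332.71828, 66.65608, 140.65161,
           215.28571, 193.04946, 58.18386, 126.87957)

# Under a Poisson model the ratio variance / mean should be close to 1
ratio <- setNames(vars / means, groups)
print(round(ratio, 1))
```

Ratios far above 1 in every treatment-by-period group suggest that the seizure counts are more variable than a Poisson model would predict.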
[Figure: number of seizures by period (1 to 4); panels: Placebo, Progabide; x-axis: Period; y-axis: Number of seizures, from 0 to 100.]
[Figure: log number of seizures by period (1 to 4); panels: Placebo, Progabide; x-axis: Period; y-axis: Log number of seizures, from 0 to 4.]
per <- rep(log(2), nrow(epilepsy))  # assumed definition (not shown on the slide): log length of each 2-week period
fm <- seizure.rate ~ base + age + treatment + offset(per)
epilepsy_glm <- glm(fm, data = epilepsy, family = "poisson")
> summary(epilepsy_glm)

Call:
glm(formula = fm, family = "poisson", data = epilepsy)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-4.4360  -1.4034  -0.5029   0.4842  12.3223

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.1306156  0.1356191  -0.963  0.33549
base                0.0226517  0.0005093  44.476  < 2e-16 ***
age                 0.0227401  0.0040240   5.651 1.59e-08 ***
treatmentProgabide -0.1527009  0.0478051  -3.194  0.00140 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2521.75  on 235  degrees of freedom
Residual deviance:  958.46  on 232  degrees of freedom
AIC: 1732.5

Number of Fisher Scoring iterations: 5
epilepsy_gee1 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "independence", scale.fix = TRUE, scale.value = 1)
> summary(epilepsy_gee1)

 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Independent

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "independence", scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate   Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.1356191185 -0.9631062 0.365148155 -0.3577058
base                0.02265174 0.0005093011 44.4761250 0.001235664 18.3316325
age                 0.02274013 0.0040239970  5.6511312 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.0478051054 -3.1942393 0.171108915 -0.8924196

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
epilepsy_gee2 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "exchangeable", scale.fix = TRUE, scale.value = 1)
R Console Page 1
> summary(epilepsy_gee2)
 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Exchangeable

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "exchangeable", scale.fix = TRUE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate   Naive S.E.   Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.2004416507 -0.651639 0.365148155 -0.3577058
base                0.02265174 0.0007527342 30.092612 0.001235664 18.3316325
age                 0.02274013 0.0059473665  3.823564 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.0706547450 -2.161227 0.171108915 -0.8924196

Estimated Scale Parameter:  1
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3948033 0.3948033 0.3948033
[2,] 0.3948033 1.0000000 0.3948033 0.3948033
[3,] 0.3948033 0.3948033 1.0000000 0.3948033
[4,] 0.3948033 0.3948033 0.3948033 1.0000000
# Exchangeable working correlation, scale parameter estimated from the data.
epilepsy_gee3 <- gee(fm, data = epilepsy, family = "poisson", id = subject,
                     corstr = "exchangeable", scale.fix = FALSE, scale.value = 1)
> summary(epilepsy_gee3)
 GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
 gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link:                      Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure:     Exchangeable

Call:
gee(formula = fm, id = subject, data = epilepsy, family = "poisson",
    corstr = "exchangeable", scale.fix = FALSE, scale.value = 1)

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-4.9195387  0.1808059  1.7073405  4.8850644 69.9658560

Coefficients:
                      Estimate  Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept)        -0.13061561 0.452199543 -0.2888451 0.365148155 -0.3577058
base                0.02265174 0.001698180 13.3388301 0.001235664 18.3316325
age                 0.02274013 0.013417353  1.6948302 0.011580405  1.9636736
treatmentProgabide -0.15270095 0.159398225 -0.9579840 0.171108915 -0.8924196

Estimated Scale Parameter:  5.089608
Number of Iterations:  1

Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3948033 0.3948033 0.3948033
[2,] 0.3948033 1.0000000 0.3948033 0.3948033
[3,] 0.3948033 0.3948033 1.0000000 0.3948033
[4,] 0.3948033 0.3948033 0.3948033 1.0000000
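A quick arithmetic check on the exchangeable fits, using only numbers printed in the summaries: each z value is the estimate divided by its standard error, and once the scale parameter is estimated rather than fixed at 1, the naive standard errors are inflated by the square root of the estimated scale, while the robust ones do not change. The object names below are illustrative only.

```r
# Values copied from the two exchangeable fits (scale fixed at 1 vs. estimated).
terms          <- c("(Intercept)", "base", "age", "treatmentProgabide")
estimate       <- c(-0.13061561, 0.02265174, 0.02274013, -0.15270095)
naive_se_fixed <- c(0.2004416507, 0.0007527342, 0.0059473665, 0.0706547450)
naive_se_free  <- c(0.452199543, 0.001698180, 0.013417353, 0.159398225)
robust_se      <- c(0.365148155, 0.001235664, 0.011580405, 0.171108915)
scale_hat      <- 5.089608

# z statistics are estimate / standard error.
robust_z <- estimate / robust_se

# Ratio of the naive standard errors, to compare with sqrt(estimated scale).
se_ratio <- naive_se_free / naive_se_fixed
data.frame(terms, robust_z, se_ratio, sqrt_scale = sqrt(scale_hat))
```

An estimated scale well above 1 points to overdispersion relative to the Poisson variance, which is why the naive z value for treatment drops from about -2.16 to about -0.96 and ends up close to the robust value.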
Generalized Linear Mixed Effects Models
• GLMs can be extended, with the inclusion of random parameters, to allow variation between subjects
• Random effects follow a multivariate normal distribution
• Conditional on the random effects, responses are independent and follow a distribution that belongs to the exponential family.
Model Specification:
• The distribution of $Y_{ij}$, conditional on the random effects $b_i$, belongs to the exponential family of distributions.
• Its variance is
$$\mathrm{Var}(Y_{ij} \mid b_i) = \phi\, v(E[Y_{ij} \mid b_i])$$
• Given $b_i$, the $Y_{ij}$ are independent of one another
• In matrix notation, the linear predictor can be written
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
and for some known link function $g(\cdot)$
$$g(E[Y_{ij} \mid b_i]) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i.$$
• Random effects can, in theory, follow any multivariate distribution. In practice, they are assumed to follow a multivariate normal distribution with mean zero and covariance matrix $G$.
GLMM for Continuous Response:
• Responses $Y_{ij}$ are independent, conditional on $b_i$, and normally distributed
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = \sigma^2,$$
where $\phi = \sigma^2$ and $v(\mu) = 1$.
• The linear predictor is
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
where $X_{ij}' = Z_{ij}' = (1, t_{ij})$ (illustration). Then
$$E(Y_{ij} \mid b_i) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i = (\beta_1 + b_{1i}) + (\beta_2 + b_{2i})\,t_{ij}.$$
• The link used here is the identity function, although other links are available
• Random effects have a bivariate normal distribution with $2 \times 2$ covariance matrix $G$
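For the random intercept and slope illustration, the marginal moments follow by averaging over $b_i$. This is a short worked derivation, assuming $E(b_i) = 0$ and conditional independence given $b_i$, and writing $g_{11}, g_{12}, g_{22}$ for the entries of $G$ (a labelling introduced here for illustration):

```latex
\begin{align*}
E(Y_{ij}) &= E\{E(Y_{ij} \mid b_i)\} = \beta_1 + \beta_2 t_{ij}, \\
\mathrm{Var}(Y_{ij}) &= \mathrm{Var}\{E(Y_{ij} \mid b_i)\} + E\{\mathrm{Var}(Y_{ij} \mid b_i)\}
  = Z_{ij}' G Z_{ij} + \sigma^2 \\
  &= g_{11} + 2 t_{ij} g_{12} + t_{ij}^2 g_{22} + \sigma^2, \\
\mathrm{Cov}(Y_{ij}, Y_{ik}) &= Z_{ij}' G Z_{ik}
  = g_{11} + (t_{ij} + t_{ik}) g_{12} + t_{ij} t_{ik} g_{22}, \qquad j \neq k.
\end{align*}
```

With the identity link, the marginal mean has the same coefficients as the conditional mean, while the random effects show up only in the variance and covariance, which may change with time.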
GLMM for Binary Response:
• Responses $Y_{ij}$ are independent, conditional on $b_i$, Bernoulli variables
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = E(Y_{ij} \mid b_i)\,\{1 - E(Y_{ij} \mid b_i)\}.$$
This means that $\phi = 1$.
• The linear predictor is given by
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i = X_{ij}'\beta + b_i,$$
where $Z_{ij}' = 1$ for all $i, j$ (illustration). Then
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i)}{P(Y_{ij} = 0 \mid b_i)}\right] = \eta_{ij} = X_{ij}'\beta + b_i$$
• $b_i \sim N(0, \sigma^2)$.
• This is a random intercept model, equivalent to the compound symmetry model.
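Conditional independence given $b_i$ does not make the responses of a subject marginally independent, because they share the same random intercept. A minimal simulation sketch in base R, with made-up values for the fixed part of the linear predictor (eta) and the random-intercept standard deviation (sigma), illustrates the induced within-subject correlation:

```r
set.seed(42)
n_subj <- 5000
sigma  <- 2       # assumed random-intercept standard deviation
eta    <- -0.5    # assumed fixed part of the linear predictor, X_ij' beta

b  <- rnorm(n_subj, mean = 0, sd = sigma)    # b_i ~ N(0, sigma^2)
p  <- plogis(eta + b)                        # P(Y_ij = 1 | b_i)
y1 <- rbinom(n_subj, size = 1, prob = p)     # two responses per subject,
y2 <- rbinom(n_subj, size = 1, prob = p)     # independent given b_i

# Marginal correlation between the two responses of the same subject.
within_subject_cor <- cor(y1, y2)
within_subject_cor
```

Setting sigma to 0 in this sketch removes the shared effect, and the correlation then fluctuates around zero.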
Longitudinal Data Analysis 261
GLMM for Counts:
• Responses $Y_{ij}$ are independent, conditional on $b_i$, following a Poisson distribution
• Variance has the form
$$\mathrm{Var}(Y_{ij} \mid b_i) = E(Y_{ij} \mid b_i).$$
This means that $\phi = 1$.
• The linear predictor is given by
$$\eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i,$$
where $Z_{ij}' = (1, t_{ij})$ for all $i, j$ (illustration). Then
$$\log E(Y_{ij} \mid b_i) = \eta_{ij} = X_{ij}'\beta + Z_{ij}'b_i$$
• Random effects follow a bivariate normal distribution with zero mean and a $2 \times 2$ covariance matrix
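A small simulation can make the count model concrete. The sketch below, in base R, draws a random intercept and slope for each subject and then Poisson counts given those effects. The values of beta, G and the time points are assumptions chosen only for illustration.

```r
set.seed(7)
n_subj <- 2000
times  <- 0:3
beta   <- c(1.0, 0.1)                        # assumed fixed intercept and slope
G      <- matrix(c(0.25, 0.01,
                   0.01, 0.01), nrow = 2)    # assumed covariance of (b_1i, b_2i)

# Draw b_i ~ N(0, G) using the Cholesky factor of G.
b <- matrix(rnorm(2 * n_subj), ncol = 2) %*% chol(G)

id  <- rep(seq_len(n_subj), each = length(times))
tij <- rep(times, times = n_subj)
eta <- (beta[1] + b[id, 1]) + (beta[2] + b[id, 2]) * tij   # log E(Y_ij | b_i)
y   <- rpois(length(eta), lambda = exp(eta))

# Given b_i the variance equals the mean; marginally the counts are overdispersed.
moments <- sapply(split(y, tij), function(v) c(mean = mean(v), var = var(v)))
moments
```

At every time point the marginal variance exceeds the marginal mean, even though $\phi = 1$ conditionally: the between-subject variation in $b_i$ adds extra spread.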
Parameter Interpretation
• Parameters in the linear predictor are now interpreted in terms of conditional probabilities, given the subject (random) effects
• Regression parameters $\beta$ in a GLMM have a different interpretation than in marginal models
• In a GLMM, $\beta$ has a subject-specific interpretation
• Specifically, $\beta$ represents the impact of covariates on changes in an individual's transformed mean response
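The difference between the subject-specific and the marginal (population-averaged) interpretation can be seen numerically in a random-intercept logistic model. The sketch below uses base R's integrate() to average the conditional probability over $b_i \sim N(0, \sigma^2)$. The values of beta0, beta1 and sigma are assumptions for illustration, and marginal_prob is a helper defined here, not a library function.

```r
# Marginal P(Y = 1) for a given fixed-effects linear predictor, obtained by
# averaging the conditional probability over b ~ N(0, sigma^2).
marginal_prob <- function(eta, sigma) {
  integrate(function(b) plogis(eta + b) * dnorm(b, mean = 0, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

beta0 <- -1   # assumed intercept
beta1 <- 2    # assumed subject-specific log-odds ratio of a binary covariate
sigma <- 2    # assumed random-intercept standard deviation

p0 <- marginal_prob(beta0, sigma)           # covariate = 0
p1 <- marginal_prob(beta0 + beta1, sigma)   # covariate = 1

population_avg_log_or <- qlogis(p1) - qlogis(p0)
c(subject_specific = beta1, population_averaged = population_avg_log_or)
```

The population-averaged log-odds ratio is closer to zero than beta1, so a GLMM coefficient should not be read as a marginal effect unless the random-effect variance is negligible.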
• Consider the example with the logistic regression model
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i)}{P(Y_{ij} = 0 \mid b_i)}\right] = X_{ij}'\beta + b_i,$$
where $b_i \sim N(0, g_{11})$. Furthermore, suppose covariate $X_{ijk}$ takes some value $x$, leading to the log-odds
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}\right] = \beta_1 X_{ij1} + \beta_2 X_{ij2} + \cdots + \beta_k x + \cdots + \beta_p X_{ijp} + b_i.$$
Additionally, if $X_{ijk} = x + 1$, then the log-odds takes the form
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x + 1, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x + 1, \ldots, X_{ijp})}\right] = \beta_1 X_{ij1} + \beta_2 X_{ij2} + \cdots + \beta_k (x + 1) + \cdots + \beta_p X_{ijp} + b_i,$$
and hence $\beta_k$ measures the change in the log-odds resulting from a unit change in covariate $X_{ijk}$ while the remaining covariates are held fixed. In terms of interpretation:
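- If the covariate $X_{ijk}$ varies within an individual (subject-specific, time-varying) then
$$\log\left[\frac{P(Y_{ij'} = 1 \mid b_i, X_{ij'1}, X_{ij'2}, \ldots, X_{ij'k} = x + 1, \ldots, X_{ij'p})}{P(Y_{ij'} = 0 \mid b_i, X_{ij'1}, X_{ij'2}, \ldots, X_{ij'k} = x + 1, \ldots, X_{ij'p})}\right] - \log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = x, \ldots, X_{ijp})}\right] = \beta_k,$$
where the interpretation is quite straightforward, since all other covariates as well as the random effects are the same and hence cancel. Hence,
$$\log\left[\frac{P(Y_{ij'} = 1 \mid b_i, \ldots)\,/\,P(Y_{ij'} = 0 \mid b_i, \ldots)}{P(Y_{ij} = 1 \mid b_i, \ldots)\,/\,P(Y_{ij} = 0 \mid b_i, \ldots)}\right] = \log \mathrm{OR} = \beta_k \;\Rightarrow\; \mathrm{OR} = \exp(\beta_k)$$
is the within-subject OR.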
- If the covariate $X_{ijk}$ is time-invariant (between-subject), like treatment group, the interpretation becomes complicated. Comparing subject $i$ (with $X_{ijk} = 1$) to a different subject $i'$ (with $X_{i'jk} = 0$) gives
$$\log\left[\frac{P(Y_{ij} = 1 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = 1, \ldots, X_{ijp})}{P(Y_{ij} = 0 \mid b_i, X_{ij1}, X_{ij2}, \ldots, X_{ijk} = 1, \ldots, X_{ijp})}\right] - \log\left[\frac{P(Y_{i'j} = 1 \mid b_{i'}, X_{i'j1}, X_{i'j2}, \ldots, X_{i'jk} = 0, \ldots, X_{i'jp})}{P(Y_{i'j} = 0 \mid b_{i'}, X_{i'j1}, X_{i'j2}, \ldots, X_{i'jk} = 0, \ldots, X_{i'jp})}\right] = \beta_k + (b_i - b_{i'}),$$
and as a result the change in log-odds is confounded by $b_i - b_{i'}$. It is misleading to give this change a subject-specific interpretation. It is seen as a model-based extrapolation (no data available) and could be sensitive to various assumptions concerning the random effects.
Estimation and Inference
• The distribution of the random effects as well as the conditional distribution of the responses are specified by the model
• As a result, the joint distribution of random effects and responses is fully specified
$$f(Y_i, b_i) = f(Y_i \mid b_i)\, f(b_i),$$
where
$$f(Y_i \mid b_i) = f(Y_{i1} \mid b_i)\, f(Y_{i2} \mid b_i) \cdots f(Y_{in_i} \mid b_i)$$
under the conditional independence assumption.
• Then, the likelihood function takes the form
$$L(\beta, \phi, G) = \prod_{i=1}^{N} \int f(Y_i \mid b_i)\, f(b_i)\, db_i,$$
where the random effects are integrated out of the likelihood, yielding a marginal likelihood averaged over the $b_i$.
• In general, the likelihood cannot be written in closed form
• As a result, numerical integration techniques are required
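To make the integral concrete, here is a minimal sketch in base R of the marginal likelihood contribution of a single subject in a random-intercept logistic model, computed with integrate(). The function subject_lik and the values of eta_fixed and sigma are assumptions for illustration. Fitting software uses faster approximations, such as Laplace or adaptive Gaussian quadrature, rather than this brute-force approach.

```r
# Marginal likelihood contribution of one subject:
# integral over b of f(y | b) f(b), with b ~ N(0, sigma^2).
subject_lik <- function(y, eta_fixed, sigma) {
  integrand <- function(b) {
    # f(y | b): product of Bernoulli terms, evaluated for each value of b.
    cond <- vapply(b, function(bi) {
      prod(dbinom(y, size = 1, prob = plogis(eta_fixed + bi)))
    }, numeric(1))
    cond * dnorm(b, mean = 0, sd = sigma)
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

eta_fixed <- c(-0.5, 0.0, 0.5)   # assumed values of X_ij' beta at three occasions
sigma     <- 1.5                 # assumed random-intercept standard deviation

subject_lik(c(1, 0, 1), eta_fixed, sigma)

# Sanity check: the contributions of all 2^3 response patterns sum to 1.
patterns <- as.matrix(expand.grid(0:1, 0:1, 0:1))
total <- sum(apply(patterns, 1, subject_lik, eta_fixed = eta_fixed, sigma = sigma))
total
```

The full likelihood is the product of such contributions over subjects, which an optimiser would then maximise over the fixed effects and the variance parameters.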
Prediction of $b_i$
• Given the MLEs $\hat\beta$, $\hat\phi$ and $\hat G$, $b_i$ can be predicted as
$$\hat{b}_i = E(b_i \mid Y_i; \hat\beta, \hat\phi, \hat G)$$
• This is the empirical Bayes predictor, or BLUP, used before
• Numerical integration techniques are also required
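As a sketch of what this conditional expectation involves, the code below computes the posterior mean of a random intercept in a logistic model as a ratio of two integrals, using base R's integrate(). The function predict_b and the plugged-in values of eta_fixed and sigma are assumptions for illustration; in practice they would be replaced by the fitted values.

```r
# Empirical Bayes prediction of a random intercept:
# E(b | y) = int b f(y | b) f(b) db / int f(y | b) f(b) db.
predict_b <- function(y, eta_fixed, sigma) {
  joint <- function(b) {
    cond <- vapply(b, function(bi) {
      prod(dbinom(y, size = 1, prob = plogis(eta_fixed + bi)))
    }, numeric(1))
    cond * dnorm(b, mean = 0, sd = sigma)
  }
  numerator   <- integrate(function(b) b * joint(b), lower = -Inf, upper = Inf)$value
  denominator <- integrate(joint, lower = -Inf, upper = Inf)$value
  numerator / denominator
}

eta_fixed <- rep(0, 4)   # assumed fixed part of the linear predictor, four occasions
sigma     <- 1.5         # assumed random-intercept standard deviation

b_all_ones  <- predict_b(c(1, 1, 1, 1), eta_fixed, sigma)
b_all_zeros <- predict_b(c(0, 0, 0, 0), eta_fixed, sigma)
c(all_ones = b_all_ones, all_zeros = b_all_zeros)
```

A subject who responds 1 on every occasion gets a positive predicted intercept, and one who always responds 0 gets the mirror-image negative value, since the fixed part is zero in this sketch.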
The lmer function (R: lme4 package)
lmer (lme4)    R Documentation

Fit (Generalized) Linear Mixed-Effects Models

Description
Fit a linear or generalized linear mixed-effects model with nested or crossed grouping factors for the random effects.

Usage
lmer(formula, data, family, method, control, start, subset, weights, na.action, offset, contrasts, model, ...)
lmer2(formula, data, family, method, control, start, subset, weights, na.action, offset, contrasts, model, ...)

Arguments
formula: a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. The vertical bar character "|" separates an expression for a model matrix and a grouping factor.
data: an optional data frame containing the variables named in formula. By default the variables are taken from the environment from which lmer is called.
family: a GLM family, see glm. If family is missing then a linear mixed model is fit; otherwise a generalized linear mixed model is fit.
method: a character string. For a linear mixed model the default is "REML" indicating that the model should be fit by maximizing the restricted log-likelihood. The alternative is "ML" indicating that the log-likelihood should be maximized. (This method is sometimes called "full" maximum likelihood.) For a generalized linear mixed model the criterion is always the log-likelihood but this criterion does not have a closed form expression and must be approximated. The default approximation is "PQL" or penalized quasi-likelihood. Alternatives are "Laplace" or "AGQ" indicating the Laplacian and adaptive Gaussian quadrature approximations respectively. The "PQL" method is fastest but least accurate. The "Laplace" method is intermediate in speed and accuracy. The "AGQ" method is the most accurate but can be considerably slower than the others.
control: a list of control parameters. See below for details.
start: a list of relative precision matrices for the random effects. This has the same form as the slot "Omega" in a fitted model. Only the upper triangle of these symmetric matrices should be stored.
subset, weights, na.action, offset, contrasts: further model specification arguments as in lm; see there for details.
model: logical indicating if the model component should be returned (in slot frame).
...: potentially further arguments for methods. Currently none are used.
Example: Respiratory Data
# Random-intercept logistic model (with lme4 >= 1.0, GLMMs are fitted with glmer()).
resp_lmer1 = lmer(status ~ centre + treatment + sex + baseline + age +
                  (1 | subject), data = resp, family = "binomial")
> summary(resp_lmer1)
Generalized linear mixed model fit using Laplace
Formula: status ~ centre + treatment + sex + baseline + age + (1 | subject)
   Data: resp
 Family: binomial(logit link)
 AIC   BIC logLik deviance
 443 471.7 -214.5      429
Random effects:
 Groups  Name        Variance Std.Dev.
 subject (Intercept) 3.8402   1.9596
number of obs: 444, groups: subject, 111

Estimated scale (compare to  1 )  0.7770601

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)        -1.64382    0.75668  -2.172   0.0298 *
centre2             1.04635    0.53075   1.971   0.0487 *
treatmenttreatment  2.16087    0.51652   4.183 2.87e-05 ***
sexmale             0.20740    0.65969   0.314   0.7532
baselinegood        3.07037    0.52499   5.848 4.96e-09 ***
age                -0.02549    0.01994  -1.278   0.2012
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) centr2 trtmnt sexmal bslngd
centre2     -0.054
trtmnttrtmn -0.407  0.018
sexmale     -0.008 -0.151  0.222
baselinegod -0.347 -0.236  0.206  0.101
age         -0.753 -0.226 -0.015 -0.255  0.069
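The fixed effects are on the log-odds scale, so exponentiating them gives odds ratios conditional on the subject's random intercept. The sketch below copies the estimates and standard errors from the table for the random-intercept fit and adds Wald 95% intervals. The Wald (normal approximation) interval is an assumption of this sketch, not something the output reports.

```r
# Estimates and standard errors copied from the fixed-effects table.
est <- c(centre2 = 1.04635, treatment = 2.16087, sexmale = 0.20740,
         baselinegood = 3.07037, age = -0.02549)
se  <- c(centre2 = 0.53075, treatment = 0.51652, sexmale = 0.65969,
         baselinegood = 0.52499, age = 0.01994)

z <- qnorm(0.975)   # normal quantile for a two-sided 95% interval
or_table <- cbind(OR    = exp(est),
                  lower = exp(est - z * se),
                  upper = exp(est + z * se))
round(or_table, 2)
```

For a between-subject covariate such as treatment, the resulting odds ratio compares two hypothetical subjects who share the same random intercept, so it should be read with the caveat about time-invariant covariates in mind.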
# Adds a random slope for age (with lme4 >= 1.0, GLMMs are fitted with glmer()).
resp_lmer2 = lmer(status ~ centre + treatment + sex + baseline + age +
                  (age | subject), data = resp, family = "binomial")
> summary(resp_lmer2)
Generalized linear mixed model fit using Laplace
Formula: status ~ centre + treatment + sex + baseline + age + (age | subject)
   Data: resp
 Family: binomial(logit link)
   AIC   BIC logLik deviance
 445.8 482.7 -213.9    427.8
Random effects:
 Groups  Name        Variance Std.Dev. Corr
 subject (Intercept) 1.964799 1.401713
         age         0.001584 0.039799 0.003
number of obs: 444, groups: subject, 111

Estimated scale (compare to  1 )  0.7859826

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)        -1.29487    0.72534  -1.785   0.0742 .
centre2             0.99755    0.50953   1.958   0.0503 .
treatmenttreatment  2.01372    0.50179   4.013 5.99e-05 ***
sexmale             0.24017    0.68883   0.349   0.7273
baselinegood        2.97704    0.51023   5.835 5.39e-09 ***
age                -0.03354    0.02107  -1.592   0.1114
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) centr2 trtmnt sexmal bslngd
centre2     -0.084
trtmnttrtmn -0.396  0.013
sexmale      0.053 -0.130  0.215
baselinegod -0.337 -0.226  0.217  0.076
age         -0.753 -0.173 -0.038 -0.316  0.042
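The two respiratory fits are nested, so their deviances can be compared informally. The sketch below uses only the printed deviances (429 for the random-intercept model, 427.8 with a random age slope) and a chi-square reference with 2 degrees of freedom, for the slope variance and the intercept-slope correlation. This is a rough check only: the deviances come from a Laplace approximation, and testing a variance on the boundary of its parameter space makes the naive chi-square p-value conservative.

```r
dev_intercept <- 429.0   # deviance, model with (1 | subject)
dev_slope     <- 427.8   # deviance, model with (age | subject)
extra_params  <- 2       # slope variance and intercept-slope correlation

lrt_stat <- dev_intercept - dev_slope
p_naive  <- pchisq(lrt_stat, df = extra_params, lower.tail = FALSE)
c(statistic = lrt_stat, p_value = p_naive)
```

The small drop in deviance, together with the higher AIC of the random-slope model (445.8 against 443), suggests that the random slope for age adds little here.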