Dynamic Factor Analysis
Ellen L. Hamaker
Methods and Statistics, Faculty of Social Sciences
Utrecht University, The Netherlands
Outline
i. Introduction
ii. Time series analysis
iii. Linear Kalman filter
iv. Illustration 1
v. Regime-switching Kalman filter
vi. Illustration 2
vii. Discussion
2 kinds of statistical techniques
Concerning means of populations:
- t-test
- ANOVA
- MANOVA
Concerning covariance structure of populations:
- correlation
- regression analysis
- factor analysis
- path analysis
Means and covariance structures combined in SEM
How did it start?
In 1884 Galton established his anthropometric
laboratory and measured mental faculties and
physical appearances of 9000 visitors. His research
subject was: variation in the population.
Galton believed most mental and physical features
were inherited.
He was worried that the protection of the weak (i.e.,
the poor) would interfere with the mechanisms of
natural selection.
Galton is the founder of eugenics.
Other important eugenicists
- Pearson: follower of Galton, and inventor of the product-moment correlation coefficient
- Spearman: student of Wundt, and inventor of factor analysis and the concept of general intelligence
- Fisher: mathematician, and inventor of ANOVA, experimental designs, the principle of maximum likelihood, inferential statistics, null-hypothesis testing, the F-test, Fisher information, non-parametric statistics, et cetera, et cetera…
Mathematical statistics
The statistical techniques used in the social sciences were developed to study heredity.
Hence, they have two important features:
a. heredity operates at the level of the population, and the same holds for these techniques
b. biometrics is concerned with studying trait-like variables, not processes
What is the problem?
Our standard techniques focus on characteristics of the population (means, correlations, proportions).
BUT… results are not always generalizable to the individual.
For instance:
- if we find a beneficial effect of therapy at the group level, this does not guarantee that every individual improved
- if we find a smooth change at the group level, it is possible that at the individual level there is a sudden change
- if 20% of clients are cured after treatment, this does not imply that an individual has a 20% chance of being cured
E.g., correlation
[Figure: two panels, interindividual and intraindividual, each plotting mistakes against words per minute.]
Who makes this mistake?
[Figure: labels "sociable" and "shy".]
Personality processes, by definition, involve some change in thoughts, feelings and actions of an individual; all these intra-individual changes seem to be mirrored by interindividual differences in characteristic ways of thinking, feeling and acting.
McCrae & John (1992)
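The same in formulas
Let i be the subject index, and x and y be two variables.
INTRAindividual correlation:
\[ \rho_{xy,i} = \frac{\sigma_{xy,i}}{\sqrt{\sigma^2_{x,i}\,\sigma^2_{y,i}}} \]
INTERindividual correlation:
\[ \rho_{xy} = \frac{\sigma_{xy}}{\sqrt{\sigma^2_x\,\sigma^2_y}} = \frac{\sigma_{\mu_{x,i}\mu_{y,i}} + \mathrm{E}(\sigma_{xy,i})}{\sqrt{\left(\sigma^2_{\mu_{x,i}} + \mathrm{E}(\sigma^2_{x,i})\right)\left(\sigma^2_{\mu_{y,i}} + \mathrm{E}(\sigma^2_{y,i})\right)}} \]
Here $\mu_{x,i}$ and $\mu_{y,i}$ are the means of subject i, and the expectations are taken over subjects.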
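To make the difference between the two correlations concrete, here is a minimal simulation sketch of the typing example. All numbers (the skill effect, the within-person effect of 0.5, the noise levels) and the function names are assumptions chosen for illustration, not values from the slides. Within each simulated person, typing faster than usual goes with more mistakes; across persons, the faster typists make fewer mistakes.

```python
import random
import statistics

def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_typists(n_subjects=200, n_occasions=100, seed=1):
    """Hypothetical data: one (speed, mistakes) pair of series per subject."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        skill = rng.gauss(0, 1)
        mean_speed = 60 + 10 * skill     # skilled typists are faster on average
        mean_mistakes = 10 - 3 * skill   # and make fewer mistakes on average
        speed, mistakes = [], []
        for _ in range(n_occasions):
            dev = rng.gauss(0, 3)        # deviation from one's own usual speed
            speed.append(mean_speed + dev)
            # typing faster than usual leads to more mistakes
            mistakes.append(mean_mistakes + 0.5 * dev + rng.gauss(0, 1))
        subjects.append((speed, mistakes))
    return subjects

subjects = simulate_typists()

# INTRAindividual: one correlation per subject, computed over occasions
intra = [pearson(speed, mistakes) for speed, mistakes in subjects]
mean_intra = statistics.fmean(intra)

# INTERindividual: one correlation over subjects, at a single occasion
inter = pearson([speed[0] for speed, _ in subjects],
                [mistakes[0] for _, mistakes in subjects])

print(f"mean intraindividual r = {mean_intra:.2f}")
print(f"interindividual r      = {inter:.2f}")
```

In this sketch the two correlations have opposite signs, so the group-level result says nothing about what happens when a given person speeds up.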
Questions about processes
Is the relationship at the INTRAindividual level identical to the relationship at the INTERindividual level?
If not, is there a universal relationship?
If not, can the differences between individuals with respect to their dynamics be related to other individual differences?
Dynamic system
A DS is a set of equations that describe how the state of the system changes as a function of its previous state.
Characteristics of a DS:
- 1 or more variables
- state = values of the variables
- stochastic/deterministic
- discrete or continuous time
- linear or nonlinear
Time series analysis is a technique to study uni- or multivariate, stochastic systems in discrete time, which may be linear or nonlinear.
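Autoregressive models
First-order autoregressive model, AR(1):
\[ y_t = \phi_1 y_{t-1} + a_t \]
Second-order autoregressive model, AR(2):
\[ y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + a_t \]
First-order vector autoregressive model, VAR(1):
\[ \mathbf{Y}_t = \boldsymbol{\Phi}_1 \mathbf{Y}_{t-1} + \mathbf{A}_t \]
[Figure: path diagrams of the series $y_{t-2}, \dots, y_{t+2}$ with innovations $a_{t-2}, \dots, a_{t+2}$, and of a second series $y^*_{t-2}, \dots, y^*_{t+2}$ with innovations $a^*_{t-2}, \dots, a^*_{t+2}$.]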
Time series
[Figure: two series (Series 1 and Series 2) plotted against time, from 0 to 70.]
Unrelated series: the first series contains autocorrelation; the second series is white noise.
Two related series: the first contains positive autocorrelation; the second contains negative autocorrelation.
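Series like these can be generated with a few lines of code. The sketch below simulates first-order autoregressive series with positive, zero, and negative autoregressive weights and estimates their lag-1 autocorrelation. The weights (0.7, 0, -0.7), the series length, and the function names are assumptions for illustration; the series are much longer than the 70 occasions in the plots so that the estimates are stable.

```python
import random

def simulate_ar1(phi, n, rng, burn_in=100):
    """Simulate y_t = phi * y_{t-1} + a_t with standard normal innovations."""
    y, out = 0.0, []
    for t in range(n + burn_in):
        y = phi * y + rng.gauss(0, 1)
        if t >= burn_in:          # drop the start-up values
            out.append(y)
    return out

def lag1_autocorr(y):
    """Sample autocorrelation between y_t and y_{t-1}."""
    m = sum(y) / len(y)
    num = sum((y[t] - m) * (y[t - 1] - m) for t in range(1, len(y)))
    den = sum((v - m) ** 2 for v in y)
    return num / den

rng = random.Random(42)
series_pos = simulate_ar1(0.7, 2000, rng)     # positive autocorrelation
series_noise = simulate_ar1(0.0, 2000, rng)   # white noise
series_neg = simulate_ar1(-0.7, 2000, rng)    # negative autocorrelation

r_pos = lag1_autocorr(series_pos)
r_noise = lag1_autocorr(series_noise)
r_neg = lag1_autocorr(series_neg)
print(f"lag-1 autocorrelations: {r_pos:.2f}, {r_noise:.2f}, {r_neg:.2f}")
```

A positive weight gives a series that drifts slowly around its mean, while a negative weight gives a series that tends to jump back and forth across it.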
Dynamic factor model
A DFM relates multiple indicators to 1 or more latent variables (factor model).
Because the variables are measured repeatedly (T>50), the dynamics can be modeled (i.e., the structure in the changes over time).
Two ways of including lagged relationships:
- lagged factor loadings
- latent VARMA process
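DFM with lagged factor loadings
[Figure: path diagram with four indicators at each occasion $t-2, \dots, t+1$ and the factors $f_{t-2}, \dots, f_{t+1}$.]
\[ \mathbf{y}_t = \sum_{q=0}^{Q} \boldsymbol{\Lambda}_q \mathbf{f}_{t-q} + \mathbf{e}_t \]
For example, with Q = 1:
\[ \mathbf{y}_t = \boldsymbol{\Lambda}_0 \mathbf{f}_t + \boldsymbol{\Lambda}_1 \mathbf{f}_{t-1} + \mathbf{e}_t \]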
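DFM with latent VARMA process
[Figure: path diagram with four indicators at each occasion $t-2, \dots, t+1$, the factors $f_{t-2}, \dots, f_{t+1}$, and the innovations $a_{t-1}, a_t, a_{t+1}$ of the factor process.]
\[ \mathbf{y}_t = \boldsymbol{\Lambda} \mathbf{f}_t + \mathbf{e}_t \]
\[ \mathbf{f}_t = \sum_{i=1}^{p} \boldsymbol{\Phi}_i \mathbf{f}_{t-i} + \mathbf{a}_t \]
For example, with p = 2:
\[ \mathbf{f}_t = \boldsymbol{\Phi}_1 \mathbf{f}_{t-1} + \boldsymbol{\Phi}_2 \mathbf{f}_{t-2} + \mathbf{a}_t \]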
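As an illustration of the second variant, the sketch below simulates one latent factor that follows a first-order autoregressive process and four indicators that load on it. The loadings, the autoregressive weight of 0.6, the unit variances, and the function names are assumptions made for this example only.

```python
import random

def simulate_dfm(loadings, phi, n, seed=7, burn_in=100):
    """Simulate f_t = phi * f_{t-1} + a_t and y_t = loadings * f_t + e_t."""
    rng = random.Random(seed)
    f = 0.0
    factor, indicators = [], []
    for t in range(n + burn_in):
        f = phi * f + rng.gauss(0, 1)             # latent AR(1) process
        if t >= burn_in:
            factor.append(f)
            # each indicator = loading * factor + unique measurement error
            indicators.append([lam * f + rng.gauss(0, 1) for lam in loadings])
    return factor, indicators

def corr(xs, ys):
    """Product-moment correlation of two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

factor, indicators = simulate_dfm(loadings=[1.0, 0.8, 0.6, 0.9], phi=0.6, n=2000)
y1 = [row[0] for row in indicators]
y2 = [row[1] for row in indicators]

r12 = corr(y1, y2)            # indicators correlate because they share the factor
ac1 = corr(y1[1:], y1[:-1])   # indicators inherit autocorrelation from the factor
print(f"corr(y1, y2) = {r12:.2f}, lag-1 autocorrelation of y1 = {ac1:.2f}")
```

The indicators themselves have no dynamics of their own here; all structure over time comes from the latent process, which is the idea behind this version of the model.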
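Kalman filter
The Kalman filter is an algorithm for estimating the latent states, and for making predictions with time series models.
It requires the model to be reformulated in state-space format, i.e.:
\[ \mathbf{y}_t = \mathbf{W}\mathbf{a}_t + \mathbf{d} + \mathbf{e}_t, \qquad \mathbf{e}_t \sim N(\mathbf{0}, \mathbf{R}) \]
\[ \mathbf{a}_t = \mathbf{H}\mathbf{a}_{t-1} + \mathbf{c} + \mathbf{G}\mathbf{z}_t, \qquad \mathbf{z}_t \sim N(\mathbf{0}, \mathbf{Q}) \]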
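Choose $\mathbf{a}_{0|0}$ and $\mathbf{P}_{0|0}$.

Occasion t = 1
Predict $\mathbf{a}_1$ and $\mathbf{P}_1$:
\[ \mathbf{a}_{1|0} = \mathbf{H}\mathbf{a}_{0|0} + \mathbf{c}, \qquad \mathbf{P}_{1|0} = \mathbf{H}\mathbf{P}_{0|0}\mathbf{H}' + \mathbf{G}\mathbf{Q}\mathbf{G}' \]
Predict $\mathbf{y}_1$ and compute $\mathbf{F}_1$:
\[ \mathbf{y}_{1|0} = \mathbf{W}\mathbf{a}_{1|0} + \mathbf{d}, \qquad \mathbf{F}_1 = \mathbf{W}\mathbf{P}_{1|0}\mathbf{W}' + \mathbf{R} \]
Determine the one-step-ahead prediction error:
\[ \mathbf{e}_{1|0} = \mathbf{y}_1 - \mathbf{y}_{1|0} \]
Update the estimate of $\mathbf{a}_1$ and $\mathbf{P}_1$:
\[ \mathbf{a}_{1|1} = \mathbf{a}_{1|0} + \mathbf{P}_{1|0}\mathbf{W}'\mathbf{F}_1^{-1}\mathbf{e}_{1|0}, \qquad \mathbf{P}_{1|1} = \mathbf{P}_{1|0} - \mathbf{P}_{1|0}\mathbf{W}'\mathbf{F}_1^{-1}\mathbf{W}\mathbf{P}_{1|0} \]
t = T? If not, continue with the next occasion.

Occasion t = 2
Predict $\mathbf{a}_2$ and $\mathbf{P}_2$:
\[ \mathbf{a}_{2|1} = \mathbf{H}\mathbf{a}_{1|1} + \mathbf{c}, \qquad \mathbf{P}_{2|1} = \mathbf{H}\mathbf{P}_{1|1}\mathbf{H}' + \mathbf{G}\mathbf{Q}\mathbf{G}' \]
Predict $\mathbf{y}_2$ and compute $\mathbf{F}_2$:
\[ \mathbf{y}_{2|1} = \mathbf{W}\mathbf{a}_{2|1} + \mathbf{d}, \qquad \mathbf{F}_2 = \mathbf{W}\mathbf{P}_{2|1}\mathbf{W}' + \mathbf{R} \]
Determine the one-step-ahead prediction error:
\[ \mathbf{e}_{2|1} = \mathbf{y}_2 - \mathbf{y}_{2|1} \]
Update the estimate of $\mathbf{a}_2$ and $\mathbf{P}_2$:
\[ \mathbf{a}_{2|2} = \mathbf{a}_{2|1} + \mathbf{P}_{2|1}\mathbf{W}'\mathbf{F}_2^{-1}\mathbf{e}_{2|1}, \qquad \mathbf{P}_{2|2} = \mathbf{P}_{2|1} - \mathbf{P}_{2|1}\mathbf{W}'\mathbf{F}_2^{-1}\mathbf{W}\mathbf{P}_{2|1} \]
t = T? If not, continue with the next occasion.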
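Goal of Kalman filter
Given the state-space model
\[ \mathbf{y}_t = \mathbf{W}\mathbf{a}_t + \mathbf{d} + \mathbf{e}_t \]
\[ \mathbf{a}_t = \mathbf{H}\mathbf{a}_{t-1} + \mathbf{c} + \mathbf{G}\mathbf{z}_t \]
obtain estimates for the states $\mathbf{a}_t$ (and predict future observations).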
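Estimation of model parameters
Choose $\mathbf{a}_{0|0}$ and $\mathbf{P}_{0|0}$. Choose $\hat{\boldsymbol{\theta}}^{(i)}$ and let $L(\hat{\boldsymbol{\theta}}^{(i)} \mid 0) = 1$. Then repeat the following steps for each occasion t until t = T.
Predict $\mathbf{a}_t$ and $\mathbf{P}_t$:
\[ \mathbf{a}_{t|t-1} = \mathbf{H}\mathbf{a}_{t-1|t-1} + \mathbf{c}, \qquad \mathbf{P}_{t|t-1} = \mathbf{H}\mathbf{P}_{t-1|t-1}\mathbf{H}' + \mathbf{G}\mathbf{Q}\mathbf{G}' \]
Predict $\mathbf{y}_t$ and compute $\mathbf{F}_t$:
\[ \mathbf{y}_{t|t-1} = \mathbf{W}\mathbf{a}_{t|t-1} + \mathbf{d}, \qquad \mathbf{F}_t = \mathbf{W}\mathbf{P}_{t|t-1}\mathbf{W}' + \mathbf{R} \]
Determine the one-step-ahead prediction error:
\[ \mathbf{e}_{t|t-1} = \mathbf{y}_t - \mathbf{y}_{t|t-1} \]
Update the prediction of $\mathbf{a}_t$ and $\mathbf{P}_t$:
\[ \mathbf{a}_{t|t} = \mathbf{a}_{t|t-1} + \mathbf{P}_{t|t-1}\mathbf{W}'\mathbf{F}_t^{-1}\mathbf{e}_{t|t-1}, \qquad \mathbf{P}_{t|t} = \mathbf{P}_{t|t-1} - \mathbf{P}_{t|t-1}\mathbf{W}'\mathbf{F}_t^{-1}\mathbf{W}\mathbf{P}_{t|t-1} \]
Enter in the likelihood function of occasion t:
\[ L(\hat{\boldsymbol{\theta}}^{(i)} \mid t) = L(\hat{\boldsymbol{\theta}}^{(i)} \mid t-1) \cdot (2\pi)^{-N_Y/2} \det(\mathbf{F}_t)^{-1/2} \exp\left\{-\tfrac{1}{2}\,\mathbf{e}_{t|t-1}'\mathbf{F}_t^{-1}\mathbf{e}_{t|t-1}\right\} \]
where $N_Y$ is the number of observed variables in $\mathbf{y}_t$.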
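The recursion can be sketched in a few lines for the simplest case, in which the state and the observation are both scalars, so every matrix reduces to a number. This is a minimal sketch under that assumption, not the implementation used for the illustrations; the function names, the simulated data, and the parameter values are hypothetical. The likelihood is accumulated on the log scale, which is the product above after taking logarithms.

```python
import math
import random

def kalman_filter(y, w, d, r, h, c, g, q, a0, p0):
    """Scalar Kalman filter; returns filtered states and the log-likelihood."""
    a, p = a0, p0
    loglik = 0.0
    filtered = []
    for y_t in y:
        # predict the state and its variance
        a_pred = h * a + c
        p_pred = h * p * h + g * q * g
        # predict the observation and compute the prediction error variance F
        y_pred = w * a_pred + d
        f = w * p_pred * w + r
        # one-step-ahead prediction error
        e = y_t - y_pred
        # update the state estimate and its variance
        a = a_pred + p_pred * w / f * e
        p = p_pred - p_pred * w / f * w * p_pred
        # log of the Gaussian density of the prediction error
        loglik += -0.5 * (math.log(2 * math.pi * f) + e * e / f)
        filtered.append(a)
    return filtered, loglik

def simulate(n, h, q, r, seed=3):
    """Hypothetical data: latent AR(1) state observed with measurement error."""
    rng = random.Random(seed)
    a, states, obs = 0.0, [], []
    for _ in range(n):
        a = h * a + rng.gauss(0, math.sqrt(q))
        states.append(a)
        obs.append(a + rng.gauss(0, math.sqrt(r)))
    return states, obs

states, y = simulate(n=500, h=0.8, q=1.0, r=1.0)
fixed = dict(w=1.0, d=0.0, r=1.0, c=0.0, g=1.0, q=1.0, a0=0.0, p0=10.0)

filtered, ll_true = kalman_filter(y, h=0.8, **fixed)   # data-generating value
_, ll_wrong = kalman_filter(y, h=-0.5, **fixed)        # a poor candidate value

mse_raw = sum((o - s) ** 2 for o, s in zip(y, states)) / len(y)
mse_filtered = sum((a - s) ** 2 for a, s in zip(filtered, states)) / len(y)
print(f"log-likelihood: h=0.8 -> {ll_true:.1f}, h=-0.5 -> {ll_wrong:.1f}")
print(f"MSE of raw observations {mse_raw:.2f}, of filtered states {mse_filtered:.2f}")
```

In practice the log-likelihood would be handed to a numerical optimizer that searches over all free parameters; the comparison of two candidate values here only shows that the filter output can be used to rank them.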
Daily measures of E & N
Data: 90 repeated measures in 22 subjects of states associated with the Five Factor Model of personality.
[Figure: bar charts (vertical axis from 0 to 3) showing total variance, state variance, and trait variance per item. Neuroticism items: irritable, emotionally stable, calm, bad tempered, resistant, vulnerable. Extraversion items: dynamic, sociable, shy, silent, lively, reserved.]
Results
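1. Does everyone have the same 2-factor structure?
- 3 persons out of 22 do not
- only small groups with the same factor loadings
2. Are there similarities in dynamics?
[Figure: path diagram with paths from $N_{t-1}$ and $E_{t-1}$ to $N_t$ and $E_t$, innovations $a_{t-1}$, $a_t$, $u_{t-1}$, and $u_t$, and paths marked + or -.]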
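State-space model with regime-switching
Regimes can be thought of as states that differ from each other with respect to their parameters.
\[ \mathbf{y}_t = \mathbf{W}_{S_t}\mathbf{a}_t + \mathbf{d}_{S_t} + \mathbf{e}_t, \qquad \mathbf{e}_t \sim N(\mathbf{0}, \mathbf{R}_{S_t}) \]
\[ \mathbf{a}_t = \mathbf{H}_{S_t}\mathbf{a}_{t-1} + \mathbf{c}_{S_t} + \mathbf{G}_{S_t}\mathbf{z}_t, \qquad \mathbf{z}_t \sim N(\mathbf{0}, \mathbf{Q}_{S_t}) \]
where $S_t$ is an unobserved discrete-valued Markov chain.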
Markov-switching process
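Let’s focus on a 2-regime first-order Markov-switching process. Thus we have: $S_t = 1, 2$.
For each regime there is a probability of staying in the same regime, and a probability of switching to the other regime.
\[ p_{11} = \Pr[S_t = 1 \mid S_{t-1} = 1] \]
\[ p_{12} = \Pr[S_t = 2 \mid S_{t-1} = 1] \]
\[ p_{21} = \Pr[S_t = 1 \mid S_{t-1} = 2] \]
\[ p_{22} = \Pr[S_t = 2 \mid S_{t-1} = 2] \]
\[ \mathbf{p} = \begin{bmatrix} p_{11} & p_{21} \\ p_{12} & p_{22} \end{bmatrix} \]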
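A small simulation shows what these probabilities imply for the regime sequence. The sketch assumes $p_{ij}$ is the probability of moving from regime i to regime j, and the values 0.9 and 0.8 for staying in regimes 1 and 2 are hypothetical. It compares the simulated share of time spent in regime 1 with the long-run share implied by the transition probabilities.

```python
import random

# p[i][j] = probability of moving from regime i to regime j (assumed values)
p = {1: {1: 0.9, 2: 0.1},
     2: {1: 0.2, 2: 0.8}}

def simulate_regimes(p, n, start=1, seed=11):
    """Simulate a first-order Markov chain over the regimes 1 and 2."""
    rng = random.Random(seed)
    s, path = start, []
    for _ in range(n):
        s = 1 if rng.random() < p[s][1] else 2
        path.append(s)
    return path

path = simulate_regimes(p, 20000)
share_regime_1 = path.count(1) / len(path)

# long-run share of regime 1 and expected length of a stay in regime 1
stationary_1 = p[2][1] / (p[1][2] + p[2][1])
expected_duration_1 = 1 / (1 - p[1][1])

print(f"simulated share of regime 1: {share_regime_1:.3f}")
print(f"implied long-run share:      {stationary_1:.3f}")
print(f"expected stay in regime 1:   {expected_duration_1:.1f} occasions")
```

High staying probabilities produce long stretches in one regime, which is the pattern one would look for in daily data.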
KF with Markov-switching
Because we do not know in which regime the process is at any occasion, we have to estimate all possibilities.
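Hence, we get 4 (M*M) predictions
\[ \mathbf{a}_{t|t-1}^{(i,j)} = E[\mathbf{a}_t \mid S_t = j, S_{t-1} = i, \mathbf{Y}_{t-1}] = \mathbf{c}_j + \mathbf{H}_j \mathbf{a}_{t-1|t-1}^{(i)} \]
and 4 updates:
\[ \mathbf{a}_{t|t}^{(i,j)} = E[\mathbf{a}_t \mid S_t = j, S_{t-1} = i, \mathbf{Y}_t] = \mathbf{a}_{t|t-1}^{(i,j)} + \mathbf{P}_{t|t-1}^{(i,j)} \mathbf{W}_j' \left[\mathbf{F}_t^{(i,j)}\right]^{-1} \mathbf{e}_{t|t-1}^{(i,j)} \]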
Collapsing the posteriors
This implies that at each step we get an M-fold increase in cases (2, 4, 8, 16, 32, …).
To overcome this problem, the $M^2$ updates are reduced to M updates through:
\[ \mathbf{a}_{t|t}^{(j)} = \sum_{i=1}^{M} \Pr[S_{t-1} = i \mid S_t = j, \mathbf{Y}_t]\; \mathbf{a}_{t|t}^{(i,j)} \]
Hence, to collapse the $M^2$ posteriors into M posteriors, we need the probabilities $\Pr[S_{t-1} = i \mid S_t = j, \mathbf{Y}_t]$. These are obtained with the Hamilton filter.
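The collapsing step is a weighted average, which a short numerical sketch makes explicit. The four updated state estimates and the joint regime probabilities below are made-up numbers for M = 2 and a scalar state; the function name is hypothetical.

```python
def collapse(a_post, weights, regimes):
    """Collapse M*M updated states a[(i, j)] into M states a[j]."""
    return {j: sum(weights[(i, j)] * a_post[(i, j)] for i in regimes)
            for j in regimes}

regimes = (1, 2)

# hypothetical updated state estimates, keyed by (previous regime i, current regime j)
a_post = {(1, 1): 1.0, (2, 1): 2.0, (1, 2): 3.0, (2, 2): 4.0}

# hypothetical joint probabilities Pr[S_{t-1} = i, S_t = j | Y_t]
joint = {(1, 1): 0.5, (2, 1): 0.1, (1, 2): 0.1, (2, 2): 0.3}

# Pr[S_t = j | Y_t], then the weights Pr[S_{t-1} = i | S_t = j, Y_t]
pr_j = {j: sum(joint[(i, j)] for i in regimes) for j in regimes}
weights = {(i, j): joint[(i, j)] / pr_j[j] for i in regimes for j in regimes}

collapsed = collapse(a_post, weights, regimes)
print(collapsed)
```

After this step only M state estimates are carried forward, so the number of cases stays at M*M per occasion instead of growing with t.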
Hamilton filter of the probabilities
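Predict the joint regime probabilities:
\[ \Pr[S_t = j, S_{t-1} = i \mid \mathbf{Y}_{t-1}] = p_{ij} \Pr[S_{t-1} = i \mid \mathbf{Y}_{t-1}] \]
Compute the density of the new observation:
\[ f(\mathbf{y}_t \mid \mathbf{Y}_{t-1}) = \sum_{i=1}^{M} \sum_{j=1}^{M} f(\mathbf{y}_t, S_t = j, S_{t-1} = i \mid \mathbf{Y}_{t-1}) \]
Update the joint regime probabilities:
\[ \Pr[S_t = j, S_{t-1} = i \mid \mathbf{Y}_t] = \Pr[S_t = j, S_{t-1} = i \mid \mathbf{y}_t, \mathbf{Y}_{t-1}] = \frac{f(\mathbf{y}_t, S_t = j, S_{t-1} = i \mid \mathbf{Y}_{t-1})}{f(\mathbf{y}_t \mid \mathbf{Y}_{t-1})} \]
where
\[ f(\mathbf{y}_t, S_t = j, S_{t-1} = i \mid \mathbf{Y}_{t-1}) = f(\mathbf{y}_t \mid S_t = j, S_{t-1} = i, \mathbf{Y}_{t-1}) \Pr[S_t = j, S_{t-1} = i \mid \mathbf{Y}_{t-1}] \]
Obtain the probabilities needed for collapsing:
\[ \Pr[S_{t-1} = i \mid S_t = j, \mathbf{Y}_t] = \frac{\Pr[S_t = j, S_{t-1} = i \mid \mathbf{Y}_t]}{\sum_{i=1}^{M} \Pr[S_t = j, S_{t-1} = i \mid \mathbf{Y}_t]} \]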
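One step of this filter can be sketched as follows. To keep the example short it assumes a deliberately simple model in which each regime only has its own mean and the observation is normally distributed around it, so the conditional density depends on the current regime j alone; in the full regime-switching Kalman filter these densities would come from the M*M prediction errors and their variances. The regime means, the transition probabilities, the observation, and the function names are hypothetical.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def hamilton_step(prior, p, dens, regimes):
    """One Hamilton filter step.

    prior[i]     = Pr[S_{t-1} = i | Y_{t-1}]
    p[(i, j)]    = Pr[S_t = j | S_{t-1} = i]
    dens[(i, j)] = f(y_t | S_t = j, S_{t-1} = i, Y_{t-1})
    """
    # predicted joint probabilities Pr[S_t = j, S_{t-1} = i | Y_{t-1}]
    joint_prior = {(i, j): p[(i, j)] * prior[i] for i in regimes for j in regimes}
    # joint density of y_t and the regime pair, and the density of y_t
    joint_dens = {k: dens[k] * joint_prior[k] for k in joint_prior}
    lik = sum(joint_dens.values())
    # updated joint probabilities Pr[S_t = j, S_{t-1} = i | Y_t]
    joint_post = {k: v / lik for k, v in joint_dens.items()}
    # filtered regime probabilities Pr[S_t = j | Y_t]
    filtered = {j: sum(joint_post[(i, j)] for i in regimes) for j in regimes}
    # probabilities needed for collapsing: Pr[S_{t-1} = i | S_t = j, Y_t]
    cond = {(i, j): joint_post[(i, j)] / filtered[j]
            for i in regimes for j in regimes}
    return lik, joint_post, filtered, cond

regimes = (1, 2)
prior = {1: 0.5, 2: 0.5}
p = {(1, 1): 0.9, (1, 2): 0.1, (2, 1): 0.2, (2, 2): 0.8}
regime_mean = {1: 0.0, 2: 3.0}   # hypothetical regime-specific means
y_t = 2.5                        # hypothetical new observation

dens = {(i, j): normal_pdf(y_t, regime_mean[j], 1.0)
        for i in regimes for j in regimes}
lik, joint_post, filtered, cond = hamilton_step(prior, p, dens, regimes)
print(f"Pr[S_t = 2 | Y_t] = {filtered[2]:.3f}")
```

Because the observation lies close to the mean of regime 2, most of the probability mass moves to that regime; the product of the `lik` values over all occasions gives the likelihood used for parameter estimation.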
Positive and negative affect
Daily measurements with palm handheld using the PANAS.
Question: Are there distinct regimes in daily affect fluctuations?
[Figure: two time series plots over occasions 0 to 100, one for positive affect and one for negative affect.]
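Negative affect subject 10
Linear model:
\[ NA_t = 1.15 + .23\,NA_{t-1} + a_t \]
AIC: 108.52
BIC: 115.95
Two regime model (one equation per regime):
\[ NA_t = 2.22 + .05\,NA_{t-1} + a_t \]
\[ NA_t = 1.13 + .12\,NA_{t-1} + a_t \]
AIC: 72.32
BIC: 92.05
[Figure: two stacked time series plots over occasions 0 to 100; Series 1 ranges from 1.0 to 2.5 and Series 2 from 0.0 to 1.0.]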
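Negative affect subject 5
Linear model:
\[ NA_t = 1.61 + .22\,NA_{t-1} + a_t \]
AIC: 80.79
BIC: 88.35
Two regime model (one equation per regime):
\[ NA_t = 2.07 + .00\,NA_{t-1} + a_t \]
\[ NA_t = 1.14 + .21\,NA_{t-1} + a_t \]
AIC: 69.04
BIC: 89.21
[Figure: two stacked time series plots over occasions 0 to 100; Series 1 ranges from 1.0 to 2.5 and Series 2 from 0.0 to 1.0. Plot title: matrix(cbind(y, round(pr[2:108, 2], 0)), 107, 2)]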
Conclusion
Today we looked at models for:
- multiple indicators
- multiple subjects
- regime switching
TSA allows us to model processes where they take place: at the level of the individual.
There are different ways in which we can combine information obtained from multiple subjects.
Ain’t seen nothing yet!
Other possibilities:
- transition probabilities as functions of observed variables
- smoothly changing parameters
- deterministic trends and cycles (weekly, monthly)
- difference scores
- intervention analysis
- change-point models
- threshold models
- ordinal data
- include predictors (situational features)
- include a partner (spouses, therapist-client, mother-child)
- and much much more…
Thank you
email: [email protected]