Dynamic Structural Equation Modeling of Intensive ... 3 and 4 Muthen.pdf · Cross-classified time...
Transcript of Dynamic Structural Equation Modeling of Intensive ... 3 and 4 Muthen.pdf · Cross-classified time...
Dynamic Structural Equation Modelingof Intensive Longitudinal Data
Using Multilevel Time Series Analysisin Mplus Version 8
Parts 3 and 4
Bengt [email protected]
Tihomir Asparouhov & Ellen Hamaker
Workshop at Johns Hopkins University, August 17 - 18, 2017
We thank Noah Hastings for excellent assistance
Bengt Muthen DSEM 1/ 96
Mplus Version 8:Methods for Analyzing Intensive Longitudinal Data
Time series analysis (N = 1)
Two-level time series analysis (N > 1)
Random effects varying across subjects (subject is level 2, somany more random effects than usual)
Cross-classified time series analysis
Random effects varying across subjects and timeDynamic Structural Equation Modeling (DSEM)
General latent variable modelingBayesian estimationStatistical background:
Asparouhov, Hamaker & Muthen (2017). Dynamic structuralequation models. Technical Report, www.statmodel.comAsparouhov, Hamaker & Muthen (2017). Dynamic latent classanalysis. Structural Equation Modeling, 24, 257-269
The Version 8 Mplus User’s Guide adds N=1 examples 6.23 - 6.28 and N > 1
examples 9.30 - 9.40, many with two parts (basic and advanced). Also webpageBengt Muthen DSEM 2/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 3/ 96
Prelude 1: Bayesian Analysis - A Thumbnail Sketch
All that’s needed:
ESTIMATOR = BAYES;
Bayesian advantages over ML
An example: Estimating a mean
Convergence of Bayes iterations
Trace and autocorrelation plots
Speed of Bayes in Mplus
Bayes references
Bengt Muthen DSEM 4/ 96
Bayesian Analysis: Advantages over ML
Six key advantages of Bayesian analysis over frequentist analysisusing maximum likelihood estimation:
1 More can be learned about parameter estimates and model fit2 Large-sample theory is not needed and small-sample performance
is better3 Parameter priors can better reflect results of previous studies4 Analyses are in some cases less computationally demanding, for
example, when maximum-likelihood requires high-dimensionalnumerical integration
5 In cases where maximum-likelihood computations areprohibitive, Bayes with non-informative priors can be viewed as acomputing algorithm that would give essentially the same resultsas maximum-likelihood if maximum-likelihood estimation werecomputationally feasible
6 New types of models can be analyzed where themaximum-likelihood approach is not practical (e.g. DSEM)
Bengt Muthen DSEM 5/ 96
Why Are Bayesian Computations PossibleWhere ML Computations Are Not?
The general modeling features of DSEM make ML almost impossible,creating the need for Bayesian estimation.
An intuitive description of the computational difference between ML andBayes (with non-informative priors):
ML works with the joint distribution of all variables to find theparameter values that give the logL maximum
Bayes works with a series of conditional distributions for theparameters to get (posterior) parameter distributions
The joint distribution can be difficult to describe whereas theconditional distributions can be easier
Bayes is sometimes the only feasible alternative when the jointdistribution is hard to formulate
Bengt Muthen DSEM 6/ 96
Prior
Posterior
Likelihood
Parameter
Figure: Informative prior
Prior
Posterior
Likelihood
Parameter
Figure: Non-informative prior
Priors:Non-informative priors (diffuse priors): Large variance (default inMplus)
A large variance reflects large uncertainty in the parameter value.As the prior variance increases, the Bayesian estimate gets closerto the maximum-likelihood estimate
Weakly informative priors: Used for technical assistanceInformative priors:
Informative priors reflect prior beliefs in likely parameter valuesThese beliefs may come from substantive theory combined withprevious studies of similar populations
Bengt Muthen DSEM 7/ 96
Bayes’ Advantage Over ML: Informative Priors
Frequentists sometimes object to Bayes using informative priors
But they already do use such priors in many cases in unrealisticways (e.g. factor loadings fixed exactly at zero)
Bayes can let informative priors reflect prior studies
Bayes can let informative priors identify models that are unidentifiedby ML which is useful for model modification (BSEM)
The credibility interval for the posterior distribution is narrower withinformative priors
Bengt Muthen DSEM 8/ 96
An MCMC Example: LSAY Math with Missing Data
n
n2 missing
math7 math10
1
Three sets of unknowns assuming bivariatenormality:
2 means2 variances, and 1 covariancen2 missing values on math10
1 Draw values for the two means from the conditional distribution of the meansconditioned on the variance-covariance parameters, the observed and missingdata, and the priors.
2 Draw values for the n2 missing values on math10 from the conditionaldistribution of missing values conditioned on the mean parameters, theobserved data, and the priors.
3 Draw values for the two variance and covariance parameters from theconditional distribution of the variance-covariance parameters conditioned onthe mean parameters, the observed and missing data, and the priors.
Bengt Muthen DSEM 9/ 96
Trace Plot: 20 MCMC Iterations for the LSAY Math10 Mean
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
MCMC iteration
62
62.2
62.4
62.6
62.8
63
63.2
63.4
63.6
63.8
64 M
ath1
0 m
ean
Starting value for the mean is the listwise estimate of 63.7
Bengt Muthen DSEM 10/ 96
Forming the Posterior Distribution of a Parameter Estimate
Bengt Muthen DSEM 11/ 96
Posterior Distributions of the LSAY Math10 MeanUsing Different Number of MCMC Iterations
62.7
26
62.7
413
62.7
567
62.7
721
62.7
874
62.8
028
62.8
181
62.8
335
62.8
488
62.8
642
62.8
795
62.8
949
62.9
103
62.9
256
62.9
41
62.9
563
62.9
717
62.9
87
63.0
024
63.0
178
63.0
331
63.0
485
63.0
638
63.0
792
63.0
945
63.1
099
63.1
252
63.1
406
63.1
56
63.1
713
63.1
867
63.2
02
63.2
174
63.2
327
63.2
481
63.2
634
63.2
788
63.2
942
63.3
095
63.3
249
63.3
402
63.3
556
63.3
709
63.3
863
63.4
017
63.4
17
63.4
324
63.4
477
63.4
631
63.4
784
Y2
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
5.5
6
6.5
7
Count
(a) 10 iterations
62.1
62.1
5
62.2
62.2
5
62.3
62.3
5
62.4
62.4
5
62.5
62.5
5
62.6
62.6
5
62.7
62.7
5
62.8
62.8
5
62.9
62.9
5
63
63.0
5
63.1
63.1
5
63.2
63.2
5
63.3
63.3
5
63.4
63.4
5
63.5
63.5
5
63.6
63.6
5
63.7
63.7
5
63.8
63.8
5
63.9
63.9
5
64
64.0
5
64.1
64.1
5
64.2
Estimate
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
5
5.5
6
6.5
7
Count
(b) 100 iterations
62
62.0
5
62.1
62.1
5
62.2
62.2
5
62.3
62.3
5
62.4
62.4
5
62.5
62.5
5
62.6
62.6
5
62.7
62.7
5
62.8
62.8
5
62.9
62.9
5
63
63.0
5
63.1
63.1
5
63.2
63.2
5
63.3
63.3
5
63.4
63.4
5
63.5
63.5
5
63.6
63.6
5
63.7
63.7
5
63.8
63.8
5
63.9
63.9
5
64
64.0
5
Estimate
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Count
(c) 500 iterations
61.9
5
62
62.0
5
62.1
62.1
5
62.2
62.2
5
62.3
62.3
5
62.4
62.4
5
62.5
62.5
5
62.6
62.6
5
62.7
62.7
5
62.8
62.8
5
62.9
62.9
5
63
63.0
5
63.1
63.1
5
63.2
63.2
5
63.3
63.3
5
63.4
63.4
5
63.5
63.5
5
63.6
63.6
5
63.7
63.7
5
63.8
63.8
5
63.9
63.9
5
64
64.0
5
64.1
Estimate
0
20
40
60
80
100
120
140
160
180
200
220
240
260
280
300
320
Count
(d) 10,000 iterations
Bengt Muthen DSEM 12/ 96
Bayes Posterior Distribution Similar to ML BootstrapDistribution: Credibility versus Confidence Intervals
-0.2
5
-0.2
-0.1
5
-0.1
-0.0
5
0
0.0
5
0.1
0.1
5
0.2
0.2
5
0.3
0.3
5
0.4
0.4
5
0.5
0.5
5
0.6
0.6
5
0.7
0.7
5
0.8
0.8
5
0.9
0.9
5
1
Indirect effect via import
0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
2000
Count
Mean = 0.20362, Std Dev = 0.11463
Median = 0.19504
Mode = 0.15029
95% Lower CI = 0.00173
95% Upper CI = 0.45225
-0.2
5
-0.2
-0.1
5
-0.1
-0.1
0
0.1
0.1
0.1
5
0.2
0.2
5
0.3
0.3
5
0.4
0.4
5
0.5
0.5
5
0.6
0.6
5
0.7
0.7
5
0.8
0.8
5
0.9
0.9
5
1
Indirect effect via import
0
200
400
600
800
1000
1200
1400
1600
1800
2000
Count
Point estimate = 0.20334
Median = 0.19606
95% Lower CI = 0.00562
95% Upper CI = 0.45779
Bengt Muthen DSEM 13/ 96
Convergence: Trace Plot for Two MCMC Chains. PSR
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
10
0
10
5 0.8
0.85
0.9
0.95
1
1.05
1.1
1.15
1.2
1.25
1.3
1.35
1.4
1.45
1.5
1.55
1.6
1.65
1.7
1.75
1.8
1.85
1.9
1.95
2
Potential scale reductioncriterion (Gelman &Rubin, 1992):
PSR =
√W +B
W, (1)
where W represents the within-chain variation of a parameter and Brepresents the between-chain variation of a parameter. A PSR valueclose to 1 means that the between-chain variation is small relativeto the within-chain variation and is considered evidence ofconvergence.
Bengt Muthen DSEM 14/ 96
Convergence of the BayesMarkov Chain Monte Carlo (MCMC) Algorithm
Figure: Premature stoppage of Bayes MCMC iterations using the PotentialScale Reduction (PSR) criterion
Bengt Muthen DSEM 15/ 96
TECH8 Screen Printing of Bayes MCMC Iterations
Bengt Muthen DSEM 16/ 96
Trace Plots Indicating Good vs Poor Mixing
0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
2000
2100
2200
2300
2400
2500
2600
2700
2800
2900
3000
3100
3200
3300
3400
3500
3600
3700
3800
3900
4000
4100
4200
4300
4400
4500
4600
4700
4800
4900
5000
5100
5200
5300
5400
5500
5600
5700
5800
5900
6000 1.9
2
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
2.9
3
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
4
4.1
4.2
4.3
0
50
0
10
00
15
00
20
00
25
00
30
00
35
00
40
00
45
00
50
00
55
00
60
00
65
00
70
00
75
00
80
00
85
00
90
00
95
00
10
00
0
10
50
0
11
00
0
11
50
0
12
00
0
12
50
0
13
00
0
13
50
0
14
00
0
14
50
0
15
00
0
15
50
0
16
00
0
16
50
0
17
00
0
17
50
0
18
00
0
18
50
0
19
00
0
19
50
0
20
00
0
20
50
0
21
00
0
21
50
0
22
00
0
22
50
0 -1.8
-1.7
-1.6
-1.5
-1.4
-1.3
-1.2
-1.1
-1
-0.9
-0.8
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
Bengt Muthen DSEM 17/ 96
Autocorrelation Plots Indicating Good vs Poor Mixing
1 2 3 4 5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-1
-0.9
-0.8
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1 2 3 4 5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-1
-0.9
-0.8
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Bengt Muthen DSEM 18/ 96
Speed Of Bayes In Mplus
Wang & Preacher (2014). Moderated mediation analysis using Bayesianmethods. Structural Equation Modeling.
Comparison of ML (with bootstrap) and Bayes: Similar statisticalperformance
Comparison of Bayes using BUGS versus Mplus: Mplus is 15 timesfaster
Reason for Bayes being faster in Mplus:
Mplus uses Fortran (fastest computational environment)Mplus uses parallel computing so each chain is computedseparatelyMplus uses the largest updating blocks possible - complicated toprogram but gives the best mixing qualityMplus uses sufficient statistics when possible
Mplus Bayes considerably easier to use
Bengt Muthen DSEM 19/ 96
Nevertheless - It’s Going To Be Slower Than Usual:Timings For The Runs In This Talk
Using smoking data with N = 230, T ≈ 150
N=1 analysis of subject 227: 0 seconds
First two-level analysis: 3:54
Cross-classified analysis spotting a trend: 1:10
Two-level trend analysis: 4:42
Cross-classified trend analysis: 34 minutes
Cross-classified ordinal factor analysis: 54 minutes (dichotomous 34mins, continuous 16 mins)
Bengts PC as of June 2012: Dell XPS 8500, i7-3770 with 8 processors, CPUof 3.40 GHz, 12 GB RAM, 64-bit.
Bengt Muthen DSEM 20/ 96
Bayes References
Gelman et al. (2014). Bayesian Data Analysis, 3rd edition
Lynch (2010). Introduction to Applied Bayesian Statistics andEstimation for Social Scientists
Bayes technical reports on the Mplus website: Seewww.statmodel.com under Papers, Bayesian Analysis
Muthen (2010). Bayesian analysis in Mplus: A brief introduction.Technical Report. www.statmodel.com
Chapter 9 of Muthen, Muthen & Asparouhov (2016). Regression andMediation Analysis using Mplus
Bengt Muthen DSEM 21/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 22/ 96
Prelude 2: Methods for Longitudinal Data
Non-intensive longitudinal data:T small (2 - 10) and N largeModeling: Auto-regressive (cross-lagged) and growth modeling
Intensive longitudinal data:T large (30-200) and N smallish (even N = 1) but can be 1,000.Often T > NModeling: We shall see
Bengt Muthen DSEM 23/ 96
Common Methods for Non-Intensive Longitudinal DataN large and T small (2 - 10):
(1) Auto-Regressive Modeling
y1 y2 y3 y4 y5
Cross-lagged modeling (e.g. y = urge, z = negative affect):
y1 y2 y3 y4 y5
z1 z2 z3 z4 z5
Bengt Muthen DSEM 24/ 96
Recent References for Cross-Lagged Modeling
Extensions of the classic cross-lagged panel model:
Hamaker et al., Psych Methods 2015: The random interceptscross-lagged panel model
Curran et al., J of Consulting & Clinical Psych 2014: Theseparation of between-person and within-person componentsBerry and Willoughby, Child Development 2016: Rethinking thecross-lagged panel model (growth model added)
These models are fitted in Mplus
Bengt Muthen DSEM 25/ 96
Common Methods for Non-Intensive Longitudinal Data:(2) Growth Modeling
●
●
●
●●
●●
●
●● ●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1 2 3 4 5
05
1015
20
Individual Curves
Time
y ou
tcom
e●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
● ● ●● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
y1 y2 y3 y4 y5
i
s
w
Bengt Muthen DSEM 26/ 96
Growth Modeling with Time-Varying Covariates
y1 y2 y3 y4 y5
i
s
w
x1 x2 x3 x4 x5
Bengt Muthen DSEM 27/ 96
Why Is Regular Growth Modeling Not Sufficient For ILD?
There are 2 problems:
1 Correlation between time points not fully explained by growthfactors alone due to closely spaced measurements -autocorrelation needs to be added
2 Time series are too long causing slow computations
Bengt Muthen DSEM 28/ 96
Solving Problem 1. Add Residual (Auto) Correlation:Growth Modeling In Single-Level, Wide Format Version
y as 5 columns in the data
y1 y2 y3 y4 y5
i
s
w
Within (level-1)Variation across time
Between (level-2)Variation across subject
Mplus User’s Guide ex6.17 - but cumbersome with large T.
Bengt Muthen DSEM 29/ 96
Solving Problem 2. Switch From Single-Level to Two-Level,Long Format Version: y as 1 column in the data
time
w
i
s
s
y
i
Within (level-1)Variation across time
Between (level-2)Variation across subject
i = yb
Mplus User’s Guide ex9.16Bengt Muthen DSEM 30/ 96
Growth Modeling In Two-Level, Long Format
time
w
i
s
s
y
i
Within (level-1)Variation across time
Between (level-2)Variation across subject
i = yb
VARIABLE: CLUSTER = subject;WITHIN = time;BETWEEN = w;
ANALYSIS: TYPE = TWOLEVEL RANDOM;MODEL: %WITHIN%
s | y ON time;%BETWEEN%y s ON w; ! y is the same as iy WITH s;
But where is the autocorrelation? And how can it be made random?
Bengt Muthen DSEM 31/ 96
Solution: Two-Level Time Series Analysis With A TrendAllowing Autocorrelation and Many Time points
time
w
i
s
s
y
i
Within
Between
time
y t
tt-1
t-1
Autoregression for the residuals instead?Hamaker (2005). Conditions for the equivalence of the autoregressive latenttrajectory model and a latent growth curve model with autoregressivedisturbances. Sociological Methods & Research.
Bengt Muthen DSEM 32/ 96
Example: Smoking Cessation (EMA)Overview of Analyses
N = 1 time series analysis
Two-level time series analysis
Cross-classified time series analysis - looking for trends over time andfinding trend functions
Adding trend to two-level time series analysis
Cross-classified time series analysis with a trend
Time-varying effect modeling (TVEM) using cross-classified timeseries analysis
Bengt Muthen DSEM 33/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 34/ 96
EMA Example: Smoking Urge Data
Shiffman smoking cessation data
N = 230, T ≈ 150: Random prompts from Personal Digital Assistant(hand held PC) approx. 5 times per day for a month
Variables: Smoking urge (0-10 scale), negative affect (unhappy,irritable, miserable, tense, discontent, frustrated-angry, sad), gender,age, quit/relapse
Shiyko et al. (2012). Using the time-varying effect model (TVEM) toexamine dynamic associations between negative affect and selfconfidence on smoking urges. Prevention Science, 13, 288-299
Bengt Muthen DSEM 35/ 96
N = 1 Time Series Analysis Of Subjects 227 And 5
Smoking urge plotted against time for subject 227 (didn’t quit)
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
105
110
115
120
125
130
135
140
145
150
155
160
165
170
175
180
185
190
195
200
TIME
0
1
2
3
4
5
6
7
8
9
10
11
12
UR
GE
Smoking urge plotted against time for subject 5 (did quit)
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
105
110
115
120
125
130
135
140
TIME
0
1
2
3
4
5
6
7
8
9
10
11
12
UR
GE
Bengt Muthen DSEM 36/ 96
N = 1 Time Series Analysis Of Subjects 227 And 5
Posterior One-Tailed 95% C.I.Estimate S.D. P-Value Lower 2.5% Upper 2.5% Significance
urge ONurge&1 0.112 0.068 0.060 -0.027 0.240negaff 1.196 0.178 0.000 0.810 1.542 *
Interceptsurge 4.882 0.494 0.000 3.899 5.865 *
Residual Variancesurge 5.719 0.635 0.000 4.646 7.070 *
Posterior One-Tailed 95% C.I.Estimate S.D. P-Value Lower 2.5% Upper 2.5% Significance
urge ONurge&1 0.822 0.050 0.000 0.723 0.918 *negaff -0.257 0.408 0.272 -1.087 0.516
Interceptsurge 0.517 0.377 0.074 -0.247 1.230
Residual Variancesurge 2.007 0.272 0.000 1.566 2.617 *
Bengt Muthen DSEM 37/ 96
Time Interval (TINTERVAL option)
Used to create a new time variable and insert missing data recordswhen data are misaligned with respect to time:
due to missed measurement occasions that are not assigned amissing value flagdue to random measurement occasions
The creation of the new time variable involves both substantive andstatistical considerations
For more details, technical discussion and simulations, see Asparouhov,Hamaker, Muthen (2017) at www.statmodel.com.
Bengt Muthen DSEM 38/ 96
A Tinterval Example for One Subject
Observed time given in fractions of a day - the 7 time points of the firstday for one subject are shown in the table below
An interval of 0.08 is used corresponding approximately to 2 hours(2/24 = 0.0833), that is, bin size = 0.08
The lowest observed time is 0.32 (0.32*24 is 7:41 am); this is the midpoint of the first bin, with the new time value 1 used in the analysis
Observed Newtime Bins time Outcome0.32 0.28 - 0.36 1 observed0.39 0.36 - 0.44 2 observed0.51 0.44 - 0.52 3 observed0.59 0.52 - 0.60 4 observed0.62 0.60 - 0.68 5 observed
0.68 - 0.76 6 missing0.77 0.76 - 0.84 7 observed
0.84 - 0.92 8 missing0.93 0.92 - 1.00 9 observed
Bengt Muthen DSEM 39/ 96
N = 1 Time Series Analysis Using Tinterval= timeqd(0.08)
Subject 5 (did quit): Tinterval results in missing data records inserted toresolve different time distances between measurements
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
TIME
0
1
2
3
4
5
6
7
8
9
10
11
12
UR
GE
Posterior One-Tailed 95% C.I.Estimate S.D. P-Value Lower 2.5% Upper 2.5% Significance
Subject 5 without Tintervalurge ONurge&1 0.822 0.050 0.000 0.723 0.918 *negaff -0.257 0.408 0.272 -1.087 0.516
Subject 5 with Tinterval (0.08)urge ONurge&1 0.844 0.037 0.000 0.772 0.917 *negaff -0.158 0.382 0.328 -0.930 0.577
Bengt Muthen DSEM 40/ 96
Mplus Input For Subject 5 Time Series Regression
TITLE: Shiffman smoking urge data N = 1 model for subject 5 (quit=1)DATA: FILE = combined relapsers quitters 03-17-17.csv;VARIABLE: NAMES = subject t day urge craving negaff arousal timeqd
gender age quit;! quit = 1 for quitters, 0 for relapsersUSEVARIABLES = urge negaff;LAGGED = urge(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);USEOBSERVATIONS = subject EQ 5;IDVARIABLE = recnum;
ANALYSIS: ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (1000);
MODEL: urge ON urge&1 negaff;negaff;
OUTPUT: TECH1 TECH8 STANDARDIZED TECH4 RESIDUAL;PLOT: TYPE = PLOT3;
Bengt Muthen DSEM 41/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 42/ 96
Two-Level Time Series Analysis (N > 1)
Analysis of all N = 230 smoking data subjectsAllowing for parameter variation across subjects using randomeffects
Bengt Muthen DSEM 43/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 44/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 45/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 46/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 47/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 48/ 96
Two-Level Time Series Analysis:Regression of Smoking Urge on Negative Affect (na)
Using 4 Random Effects
na na
Within
t-1 t
urge urgephi
logv
t-1 t
syx
Bengt Muthen DSEM 49/ 96
Mplus Input for Two-Level Regression Analysis
VARIABLE: NAMES = subject t day urge craving negaff arousal timeqdgender age quit;!quit = 1 for quitters, 0 for relapsersUSEVARIABLES = urge negaff age female;CLUSTER = subject;BETWEEN = female age;WITHIN = negaff;LAGGED = urge(1) negaff(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);
DEFINE: female = gender - 1;age = (age-44.3)/10.1;CENTER negaff(GROUPMEAN);
ANALYSIS: TYPE = TWOLEVEL RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (1000);
Bengt Muthen DSEM 50/ 96
Mplus Input for Two-Level Regression Analysis, Cont’d
MODEL: %WITHIN%phi | urge ON urge&1;logv | urge;syx | urge ON negaff;negaff ON negaff&1;%BETWEEN%urge phi logv syx ON female age;urge phi logv syx WITH urge phi logv syx;
OUTPUT: TECH1 TECH8 FSCOMPARISON STANDARDIZEDTECH4 RESIDUAL;
PLOT: TYPE = PLOT3;FACTORS = ALL;
Run time: 3:54 (3:13 without FACTORS=ALL)
Bengt Muthen DSEM 51/ 96
Between-Level Results
female
age
urge
logv
phi+_
syx
Between
phi ON female not significant unless both logv and syx are allowed tobe random
Bengt Muthen DSEM 52/ 96
New Output Warnings
*** WARNINGOne or more individual-level variables have no variation within a cluster for the following clusters.Variable Cluster IDs with no within-cluster variation
URGE 160 12 60 192 186 49
WARNING: PROBLEMS OCCURRED IN SEVERAL ITERATIONS IN THE COMPUTATIONOF THE STANDARDIZED ESTIMATES FOR SEVERAL CLUSTERS. THIS IS MOST LIKELY DUETO AR COEFFICIENTS GREATER THAN 1 OR PARAMETERS GIVING NON-STATIONARY MODELS.SUCH POSTERIOR DRAWS ARE REMOVED. THE FOLLOWING CLUSTERS HAD SUCH PROBLEMS:
160 115 205
BETWEEN-LEVEL FACTOR SCORE COMPARISONS
Results for Factor PHI
Ranking Cluster Factor Score Ranking Cluster Factor Score Ranking Cluster Factor Score1 115 0.941 2 11 0.936 3 205 0.9124 113 0.907 5 138 0.906 6 4 0.901
Bengt Muthen DSEM 53/ 96
Checking the Time Interval Value
The choice of 0.08 gives a warning:THE VALUE SPECIFIED IN THE TINTERVAL OPTION MAY BETOO BIG. THE MAXIMUM DISCREPANCY BETWEEN THEACTUAL TIME AND THE TIME RECODED BY THE TINTERVALOPTION IS 9.517 IN CLUSTER 33.
0.08 corresponds to 2 hours: 2/24 = 0.08
Because in this run 0.08 is represented as one time unit in the analysis,9.517 corresponds to 9.517*2 = 19 hours displacement
Given the large displacement, the data for cluster (subject) 33 shouldbe inspected: first 30 observations made in less than 2-day span!? (aim:5/day)
0.08 should perhaps be changed to say 0.04
Bengt Muthen DSEM 54/ 96
Checking Sensitivity to Tinterval Choices
Univariate analysis of urge ON urge&1 with non-random φ and onlythe mean random
Tinterval φ Coverage Time (secs)0.08 0.325 0.41 220.06 0.344 0.31 250.04 0.373 0.20 40
If an AR(1) model holds, φ with interval t (say 0.04), results in φ 2 forinterval 2t (0.08)
Because 0.3732 is not equal to 0.325, a pure AR(1) model does nothold (for instance, urge ON urge&2 is significant)
Smaller time interval gives more missing data, i.e. lower coverage
10 - 15 % coverage is goodCoverage as low as 5% is ok
Smaller time interval gives longer run time
Bengt Muthen DSEM 55/ 96
Technical Interlude: Ampersand Versus HatWhere Should The Autocorrelation Be Applied?
na na
Within
t-1 t
urge urgephit-1 t
na na
Within
t-1 t
urge urget-1 t
residphi
Bengt Muthen DSEM 56/ 96
Comparing Regular vs Residual Auto-Correlation
Regular AR(1):urge ON negaff;phi | urge ON urge&1;
Residual AR(1) in V8.1:urge ON negaff;phi | urgeˆ ON urgeˆ1;
Model DIC pDRegular AR (for the whole outcome) 45,3727 69623Residual AR 45,6347 70440
DIC: Deviance information criterion
Bayesian counterpart to BIC (lower is better)
pD: Effective number of parameters
Also includes latent variables and missing dataDIC requires a large number of iterations (20K used here); only for continuousoutcomes
Bengt Muthen DSEM 57/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 58/ 96
Cross-Classified Time Series Analysis (N > 1)
Two between-level cluster variables: subject crossed with time (oneobservation for a given subject at a given time point).
Generalization of the two-level model providing more flexibility:random effects can vary across not only subject but also time
Consider the two-level model with a random intercept/mean:
yit = α +αi︸ ︷︷ ︸Between subject
+β yw,it−1 + εit︸ ︷︷ ︸Within subject
. (2)
The corresponding cross-classified model is:
yit = α +αi︸ ︷︷ ︸Between subject
+ αt︸︷︷︸Between time
+β yw,it−1 + εit︸ ︷︷ ︸Within subject
. (3)
The Bayes MCMC algorithm is more complex and considerably slower
Bengt Muthen DSEM 59/ 96
Cross-Classified Analysis:A Quick Way to Spot a Trend in a Variable
TITLE: Shiffman smoking urge data, checking for trend in urgeDATA: FILE = combined relapsers quitters 03-17-17.csv;VARIABLE: NAMES = subject t day urge craving negaff arousal timeqd gender age
quit; !quit = 1 for quitters, 0 for relapsersUSEVARIABLES = urge;CLUSTER = subject timeqd;LAGGED = urge(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);
ANALYSIS: TYPE = CROSSCLASSIFIED RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (1000);
MODEL: %WITHIN%urge ON urge&1;%BETWEEN subject%urge;%BETWEEN timeqd%urge;
OUTPUT: TECH1 TECH8;PLOT: TYPE = PLOT3;
FACTORS = urge(50);
Bengt Muthen DSEM 60/ 96
Cross-Classified Analysis of Trend:Time Series Plot of Urge Factor Scores
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
Time
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1 B
2a_U
RG
E, m
ean
Run time is only 1:10 with fixed AR(1)The trend can be modeled according to some functional form
In a cross-classified analysisIn a two-level analysis
Bengt Muthen DSEM 61/ 96
Imposing Linear/Quadratic Trend in Cross-Classified
VARIABLE: NAMES = subject t day urge craving negaff arousal timeqd gender agequit; !quit = 1 for quitters, 0 for relapsersUSEVARIABLES = urge time time2;CLUSTER = subject timeqd;BETWEEN = (timeqd) time time2;LAGGED = urge(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);
DEFINE: time = timeqd/100;time2 = time*time;
ANALYSIS: TYPE = CROSSCLASSIFIED RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (2000);
MODEL: %WITHIN%urge ON urge&1;%BETWEEN subject%urge;%BETWEEN timeqd%urge;urge ON time time2;
OUTPUT: TECH1 TECH8;PLOT: TYPE = PLOT3;
FACTORS = urge(50);
Bengt Muthen DSEM 62/ 96
Imposing Linear/Quadratic Trend in Cross-Classified
Linear:
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
Time
-1.5
-1.4
-1.3
-1.2
-1.1
-1
-0.9
-0.8
-0.7
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
B2
a_U
RG
E,
mea
n
Quadratic:
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
Time
-1.05
-0.95
-0.85
-0.75
-0.65
-0.55
-0.45
-0.35
-0.25
-0.15
-0.05
0.05
0.15
0.25
0.35
0.45
0.55
0.65
B2
a_U
RG
E,
mea
n
Bengt Muthen DSEM 63/ 96
Modeling the Trend: Recall How Growth ModelingCan Be Transformed From Wide To Long
y1 y2 y3 y4 y5
i
s
w
Within (level-1)Variation across time
Between (level-2)Variation across individual
Bengt Muthen DSEM 64/ 96
Growth Modeling: Two-Level, Long Format Version
Within (level-1)Variation across time
Between (level-2)Variation across individual
time
y
w
i
i
s
s
Bengt Muthen DSEM 65/ 96
Two-Level Time Series Analysis of Smoking Urge dataAdding a Trend for Urge.
- Growth Analysis with a Time-Varying Covariate
urge urge
Within
t-1 t
nat-1 timet-1 na t timet
ssyx
Interpretation of s not the usual one; direct effect at each timeAn alternative formulation places the autoregression on theresiduals (Hamaker, 2005; SM&R), resulting in the usual sinterpretation
Bengt Muthen DSEM 66/ 96
Mplus Input For Two-Level Trend Analysis
VARIABLE: NAMES = subject t day urge craving negaff arousal timeqdgender age quit;USEVARIABLES = urge quit negaff age female time;CLUSTER = subject;BETWEEN = female age quit;CATEGORICAL = quit;WITHIN = time negaff;LAGGED = urge(1) negaff(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);
DEFINE: female = gender - 1;age = (age-44.3)/10.1;time = timeqd/100-1.305;CENTER negaff(GROUPMEAN);
ANALYSIS: TYPE = TWOLEVEL RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (1000);
Bengt Muthen DSEM 67/ 96
Mplus Input For Two-Level Trend Analysis, Cont’d
MODEL: %WITHIN%phi | urge ON urge&1;syx | urge ON negaff;logv | urge;s | urge ON time;negaff ON negaff&1;time;%BETWEEN%urge syx s phi logv ON female age;urge syx s phi logv WITH urge syx s phi logv;quit ON urge syx s phi logv female age;
OUTPUT: TECH1 TECH8 FSCOMPARISON STANDARDIZED TECH4RESIDUAL;
PLOT: TYPE = PLOT3;FACTORS = ALL;
Run time: 4:42 (3:56 without FACTORS = ALL)
Bengt Muthen DSEM 68/ 96
Results for Two-Level Regression Analysisof Smoking Urge Data: Adding a Trend for Urge.- Growth Analysis with a Time-Varying Covariate
female
age
urge
logv
phiquit
+
_
+_
_
s
syx
Between
_
Quit (binary) regressed on random effects:
higher urge gives lower quit probability
higher autocorrelation gives higher quit probability
higher residual variance gives lower quit probability
higher trend slope gives lower quit probability
Bengt Muthen DSEM 69/ 96
Time-Varying Effect Modeling (TVEM)Using Cross-Classified Analysis
Cross-classified modeling allows parameters to change over time
An example is a regression slope
Does the influence of negative affect on smoking urge declineover time?
Bengt Muthen DSEM 70/ 96
Cross-Classified Regression Analysis of Smoking Urge Data:Adding a Trend for Urge and the Negaff Regression Slope
Cross-Classified Growth AnalysisWith a Time-Varying Covariate
Between Subject
syxtimet
Between Time
female
age
urge
s
syx quit
Bengt Muthen DSEM 71/ 96
Mplus Input for Cross-Classified Regression Analysiswith an Urge Trend and a Negaff Slope Trend
VARIABLE: NAMES = subject t day urge craving negaff arousal timeqdgender age quit;! quit = 1 for quitters, 0 for relapsersUSEVARIABLES = urge quit negaff age female timew timet;CLUSTER = subject timeqd;BETWEEN = (subject) female age quit (timeqd) timet;CATEGORICAL = quit;WITHIN = negaff timew;LAGGED = urge(1);MISSING = ALL(999);TINTERVAL = timeqd(0.08);
DEFINE: female = gender - 1;age = (age-44.3)/10.1;timew = timeqd/100-1.305;timet = timew;CENTER negaff(GROUPMEAN subject);
ANALYSIS: TYPE = CROSS RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (1000);
Bengt Muthen DSEM 72/ 96
Input Continued
MODEL: %WITHIN%urge ON urge&1;syx | urge ON negaff;s | urge ON timew;negaff;timew;%BETWEEN subject%urge syx s ON female age;urge syx s WITH urge syx s;quit ON urge syx s female age;s; [s];%BETWEEN timeqd%syx ON timet;urge WITH syx;s@0;
OUTPUT: TECH1 TECH8;PLOT: TYPE = PLOT3;
FACTORS = ALL;
Run time: 34 minutesBengt Muthen DSEM 73/ 96
Trend in Slope for Urge Regressed on Negative Affect
%BETWEEN timeqd%syx ON timet; ! the estimate is significant negative
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200
210
220
230
240
250
260
270
280
290
300
Time
-0.6
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
SY
X%
2a
, m
ea
n
The effect of negative affect on smoking urge is reduced over time
syx%2a is the between-time factor score part of the syx slope (its meanis zero) - the full syx slope mean is expressed as on the next slide
Bengt Muthen DSEM 74/ 96
Estimated syx Slope Mean At Different Time Points:A Time-Varying Effect (TVEM)
E(syx|timet, female,age) = 0.474 −0.156∗ timet︸ ︷︷ ︸from the between time level
+0.072∗ female+0.000∗age︸ ︷︷ ︸from the between subject level
DEFINE: timet = timeqd/100-1.305; with range -1.2 to +1.2
timet = -1, males: E(syx|∗) = 0.63timet = 0, males: E(syx|∗) = 0.47timet = 1, males: E(syx|∗) = 0.32
Bengt Muthen DSEM 75/ 96
Outline
Introduction to Bayesian analysis
Introduction to longitudinal analysis
N=1 time series analysis
Two-level time series analysis
Cross-classified time series analysis
Latent variable time series analysis
Bengt Muthen DSEM 76/ 96
Time Series Analysis with Latent Variables:Latent Variables on the Within Level
So far we have focused on latent variables on the between levelin the form of random effects
Although on Within we have used the latent variable within-leveldecomposition of the outcome, centering by ybi:
ywit = yit− ybi
Now we introduce within-level factors:Factors defined by single indicators with measurement errorResidual factors in ARMA(1,1)Factors defined by multiple indicatorsTwo-level and Cross-classified analysis
Categorical latent variables (version 8.x, although an SEMarticle is already online; Asparouhov, Hamaker, Muthen, 2017):
Transition modeling (Hidden Markov, regime switching,time-series LTA) with latent class variablesGrowth mixture modeling
Bengt Muthen DSEM 77/ 96
Single-Indicator Measurement Error Model
yt-1 yt
ft-1 ft
y s
Within
Between
Two types of errors:
Dynamic errors carry overfrom occasion to occasion(unobserved influences)Measurement errors don’tcarry over (making errorsanswering; white noise)
The model is identified unlikeregular factor analysis due toauto-regressive feature (likepanel data modeling a la Werts,Linn & Joreskog, 1977)
Schuurman et al. (2015)Frontiers of Psych; N = 1
Bengt Muthen DSEM 78/ 96
Input for Measurement Error Model
TITLE: this is an example of a two-level time series analysis with a first-order autoregressive AR(1) factor analysis model for a singlecontinuous indicator and measurement error
DATA: FILE = ex9.33.dat;VARIABLE: NAMES = y subject;
CLUSTER = subject;ANALYSIS: TYPE = TWOLEVEL RANDOM;
ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (5000);
MODEL: %WITHIN%f BY y@1(&1);s | f ON f&1;%BETWEEN%y WITH s;
OUTPUT: TECH1 TECH8;PLOT: TYPE = PLOT3;
Bengt Muthen DSEM 79/ 96
A Related Model: ARMA(1,1)
yt-1 yt
et-1 et
y s
Within
Between
MODEL: %WITHIN%s | y ON y&1;e BY y@1 (&1);[email protected];y ON e&1;
AR stands for autoregressive and MA stands for moving average (Shumway &Stoffer, 2011)
Bengt Muthen DSEM 80/ 96
Thoughts on Measurement Error versus ARMA(1,1)
Granger and Morris (1976) and Schuurman et al. (2015) show that for N = 1time series analysis, ARMA (1, 1) is an alternative representation of the dataused in the measurement error model; formulas show translation of parameters
In the Mplus implementation the measurement error formulationconverges more smoothly than ARMA(1,1)The N = 1 versions of these models require a large T, say T > 100Preliminary simulations indicate that the N > 1 versions havegood performance at T = 50, reasonable performance at T = 25,and maybe acceptable performance at T = 14: Suitable for dailydiary designs
AR models assume exponential decays in autocorrelation - the measurementerror model allows a slower, more realistic decay (Asparouhov, 2017)
A preliminary observation: it appears to be difficult to add random variance tothe factor in the measurement error model
Research questions: How does performance compare to having multipleindicators (e.g. 10 NA items)? Is random variance easier there?
Bengt Muthen DSEM 81/ 96
Two-Level Factor Analysis: UG ex9.34
y1 y2 y3 y4
ft-1 ft
t t t t
Within
Between
y1 y2 y3 y4
s fb logv
Random intercepts becomelatent factor indicators onBetween
The figure shows a DAFS (directautoregressive factor score)model on Within
An alternative is the WNFS(white noise factor score) modelwhich uses y1-y4 ON f&1instead of f ON f&1
A combination model is alsoidentified (may need large T)
N = 1 factor analysis: Engle &Watson (1981) in JASA,Molenaar (1985) inPsychometrika
Bengt Muthen DSEM 82/ 96
Cross-Classified Factor Analysis: UG ex9.40
ft-1 ft
Within
Between subject
y1
Between time
t
fsubj
y1 y2 y3
ftime
y2 y3t t
y1 y2 y3
Bengt Muthen DSEM 83/ 96
Measurement Non-Invariance Across SubjectsUsing Two-Level Factor Analysis: Random Intercepts
For a certain item measured for individual i at time t, two-level factoranalysis (see, e.g., Muthen, 1994) considers
yit = ν +λb fbi + εbi︸ ︷︷ ︸Between
+λw fwit + εwit︸ ︷︷ ︸Within
. (4)
This can be re-expressed as
Level 1 : yit = νi +λw fwit + εwit, (5)Level 2 : νi = ν +λb fbi + εbi, (6)
which is a random intercept model, that is, there is measurementnon-invariance across subjects wrt the intercepts (Jak et al., 2013, 2014;Muthen & Asparouhov, 2017). IRT typically uses λw = λb, εbi = 0,
Level 1 : yit = ν +λ fit + εwit, (7)Level 2 : fit = fbi + fwit, (8)
- that is, no non-invariance and one single factor dimension
Bengt Muthen DSEM 84/ 96
Measurement Non-Invariance of Intercepts and LoadingsAcross Subjects and Time
Mplus Version 8 offers:
Two-level analysis:
Random intercepts varying across subjectsRandom loadings varying across subjects: s1 - s10 | f BY y1 - y10Asparouhov & Muthen (2015), Fox (2010)
Cross-classified analysis:
Random intercepts varying across subjects and timeRandom loadings varying across subjectsVersion 7.4 had cross-classified analysis with random interceptsand loadings but not auto-correlation needed for ILDFor an example of random loadings varying across subjects, seethe Mplus Version 8 User’s Guide ex9.40 part2
Bengt Muthen DSEM 85/ 96
Subject-Specific Reliability
The two-level and cross-classified factor analysis models imply
Measurement intercept and loadings possibly varying acrosssubject and timeFactor variances and residual variances varying across subjectand time
This implies that reliabilities of test scores (based on a set of items) vary acrosssubject and time
Hu, Nesselroade et al. (2016). Test reliability at the individuallevel. Structural Equation Modeling.
But why not instead look at the precision with which the factor scores can beestimated?
Mplus Version 8 Monte Carlo simulations give correlationsbetween true scores and estimated scores
Bengt Muthen DSEM 86/ 96
Example: Item Factor Analysis (IRT)Using 10 Negative Affect Items
Data from the older cohort of the Notre Dame Study of Health &Well-being (Bergeman): N = 270, T = 56 (daily measures onconsecutive days)
Wang, Hamaker, Bergeman (2012). Investigating inter-individualdifferences in short-term intra-individual variability. PsychologicalMethods
Predictors and distal outcomes of negative affect development over the56 days
10 NA items (5-cat scale): afraid, ashamed, guilty, hostile, scared,upset, irritable, jittery, nervous, distressed (average score used inarticle). Wide format would have 56*10 variables
Question format: Today I felt... (1 = Not at all, ..., 5 = Extremely)
1-factor DAFS model
Bengt Muthen DSEM 87/ 96
Negative Affect Distributions of NA in Bergeman Data
Average score (55% at floor value of 1 - Not at all for all 10 items):
1.1
1.3
1.5
1.7
1.9
2.1
2.3
2.5
2.7
2.9
3.1
3.3
3.5
3.7
3.9
4.1
4.3
4.5
4.7
4.9
NEGATIVE AFFECT
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
5500
6000
6500
7000
7500
8000
8500
9000
9500
10000
Count
Typical item distribution (66% at lowest value - Not at all):
1.1
1.3
1.5
1.7
1.9
2.1
2.3
2.5
2.7
2.9
3.1
3.3
3.5
3.7
3.9
4.1
4.3
4.5
4.7
4.9
IRRITABLE
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
22000
24000
26000
Count
Bengt Muthen DSEM 88/ 96
Ordered Categorical Item Modeling:Proportional Odds Model (Graded Response Model)
y*
y
Despite non-normal y, we can have normality of:The latent response variable y*Any factors in the modelThe between-level random effects
Bengt Muthen DSEM 89/ 96
Mplus Input for Cross-Classified Factor Analysiswith One Factor for 10 Ordinal NA Items
TITLE: Bergeman twolevelDATA: FILE = bergeman.csv;VARIABLE: NAMES = subject gender age hosp1 chrhlth1 Somhlth1 slfhlth1
psqi neo day afraid1 unhappy1 annoyd1 ashmd1 guilty1 an-gry1 sad1 hostile1 scared1 upset1 irrtbl1 deprsd1 jttry1 drowsy1slugish1 worrid1 nervs1 lonely1 fatiged1 distrsd1 nPANAS1;USEVARIABLES = afraid1 scared1 nervs1 jttry1 guilty1ashmd1 irrtbl1 hostile1 upset1 distrsd1;CATEGORICAL = afraid1-distrsd1;CLUSTER = subject day;MISSING = all(999);TINTERVAL = day(1);
ANALYSIS: TYPE = CROSSCLASSIFIED RANDOM;ESTIMATOR = BAYES;PROCESSORS = 2;BITERATIONS = (5000);THIN = 10;
Bengt Muthen DSEM 90/ 96
Mplus Input for Cross-Classified Factor Analysis Continued
IRT-style loading equality, setting the factor metric on the subject level
MODEL: %WITHIN%na w BY afraid1-distrsd1* (&1 1-10);na w ON na w&1;%BETWEEN SUBJECT%na subj BY afraid1-distrsd1* (1-10);na subj@1;%BETWEEN DAY%na time BY afraid1-distrsd1* (1-10);
OUTPUT: TECH1 TECH8 STDY STDYX TECH4 RESIDUALFSCOMPARISON;
PLOT: TYPE = PLOT3;FACTORS = ALL;
Run time: 54 minutes (dichotomized: 34 minutes; continuous: 16minutes)
Bengt Muthen DSEM 91/ 96
Results of Cross-Classified Factor Analysiswith One NA Factor for 10 Ordinal Items
na factorit = α +αi︸ ︷︷ ︸Between subject
+ αt︸︷︷︸Between time
+β yw,it−1 + εit︸ ︷︷ ︸Within subject
. (9)
V(na subject) = 1.00, V(na time) = 0.012, V(na w) = 0.66
The factor score plot for the na time factor (on the between day level)shows a drop of 40% of the total factor SD over the 56 days:
0
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
34
36
38
40
42
44
46
48
50
52
54
56
58
60
DAY
-0.27
-0.22
-0.17
-0.12
-0.07
-0.02
0.03
0.08
0.13
0.18
0.23
NA
TIM
E, m
ean
Bengt Muthen DSEM 92/ 96
Posterior Distributions for the Factor Scoreson Within, Between Subject, and Between Time
-1.7
2307
-1.4
5738
-1.1
917
-0.9
26012
-0.6
60327
-0.3
94642
-0.1
28958
0.1
36727
0.4
02412
0.6
68097
0.9
33782
1.1
9947
1.4
6515
1.7
3084
1.9
9652
2.2
6221
2.5
2789
2.7
9358
3.0
5926
3.3
2495
NA_W, mean
0
200
400
600
800
1000
1200
1400
1600
1800
2000
2200
2400
2600
2800
3000
3200
3400
3600
Count
-1.9
68
84
-1.6
83
75
-1.3
98
65
-1.1
13
55
-0.8
28
45
6
-0.5
43
35
9
-0.2
58
26
2
0.0
26
83
45
0.3
11
93
1
0.5
97
02
8
0.8
82
12
5
1.1
67
22
1.4
52
32
1.7
37
42
2.0
22
51
2.3
07
61
2.5
92
71
2.8
77
8
3.1
62
9
3.4
48
NA_SUBJ, mean
0
5
10
15
20
25
30
35
40
45
50
Count
-0.2
49
86
5
-0.2
22
37
6
-0.1
94
88
7
-0.1
67
39
8
-0.1
39
91
-0.1
12
42
1
-0.0
84
93
22
-0.0
57
44
34
-0.0
29
95
47
-0.0
02
46
59
5
0.0
25
02
28
0.0
52
51
15
0.0
80
00
03
0.1
07
48
9
0.1
34
97
8
0.1
62
46
7
0.1
89
95
5
0.2
17
44
4
0.2
44
93
3
0.2
72
42
1
NA_TIME, mean
0
200
400
600
800
1000
1200
1400
1600
1800
2000
2200
2400
2600
Count
Bengt Muthen DSEM 93/ 96
DSEM Web Page
Feel free to submit your papers to be posted here:
http://www.statmodel.com/TimeSeries.shtml
Bengt Muthen DSEM 94/ 96
References
Asparouhov, T. & Muthen, B. (2015). General random effect latent variable modeling: Random subjects, items, contexts, andparameters. In Harring, J. R., Stapleton, L. M., & Beretvas, S. N. (Eds.), Advances in multilevel modeling foreducational research: Addressing practical issues found in real-world applications. Charlotte, NC: Information AgePublishing, Inc.
Asparouhov, T., Hamaker, E.L. & Muthen, B. (2017). Dynamic structural equation models. Technical Report.www.statmodel.com
Asparouhov, T., Hamaker, E.L. & Muthen, B. (2017). Dynamic latent class analysis. Structural Equation Modeling, 24,257-269.
Bolger, N. & Laurenceau, J-P. (2013). Intensive longitudinal methods: An introduction to diary and experience samplingresearch. New York: Guilford.
Engle, R. & Watson, M. (1981). A one-factor multivariate time series model of metropolitan wage rates. JASA, 76, 774-781.
Fox, J.P. (2010). Bayesian item response theory. Springer.
Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (2014). Bayesian data analysis. Third edition. New York: Chapman &Hall.
Granger, C.W.J. & Morris, M.J. (1976). Time series modelling and interpretation. Journal of the Royal Statistical Society,Series A, 139, 246-257.
Hamaker, E.L. (2005). Conditions for the equivalence of the autoregressive latent trajectory model and a latent growth curvemodel with autoregressive disturbances. Sociological Methods & Research.
Hamaker, E.L.,Schuurman, N.K., & Ziljmans, E.A.O. (2017). Using a few snapshots to distinguish mountains from waves:Weak factorial invariance in the context of trait-state research. Multivariate Behavioral Research.
Hamaker, E.L. & Wichers, M. (2017). No time like the present: Discovering the hidden dynamics in intensive longitudinal data.Current Directions in Psychological Science.
Hu, Nesselroade et al. (2016). Test reliability at the individual level. Structural Equation Modeling.
Bengt Muthen DSEM 95/ 96
References Continued
Jak, S., Oort, F.J., & Dolan, C.V. (2013). A test for cluster bias: Detecting violations of measurement invariance across clustersin multilevel data. Structural Equation Modeling, 20, 265-282.
Jak, S., Oort, F.J., & Dolan, C.V. (2014). Measurement Bias in Multilevel Data. Structural Equation Modeling, 21:1, 31-39,DOI: 10.1080/10705511.2014.856694
Lynch, S. M. (2010). Introduction to applied Bayesian statistics and estimation for social scientists. Springer.
Molenaar, P. (1985). A dynamic factor model for the analysis of multivariate time series. Psychometrika, 50, 181-202.
Muthen, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue ofSociological Methods & Research, 22, 376-398.
Muthen, B. & Asparouhov, T. (2017). Recent methods for the study of measurement invariance with many groups. Alignmentand random effects. Forthcoming in a special measurement invariance issue of Sociological Methods & Research.
Muthen, B., Muthen, L, & Asparouhov, T. (2016). Regression and mediation analysis using Mplus. wwww.statmodel.com.
Schuurman, N.K., Houtveen, J.H., & Hamaker, E.L. (2015). Incorporating measurement error in n=1 psychologicalautoregressive modeling. Frontiers in Psychology, 6, 1-15.
Schuurman, N.K., Ferrer, E., de Boer-Sonnenschein, M., & Hamaker, E.L. (2016). How to compute cross-lagged associations ina multilevel autoregressive model. Psychological Methods, 21, 206-221.
Shiyko et al. (2012). Using the time-varying effect model (TVEM) to examine dynamic associations between negative affectand self confidence on smoking urges. Prevention Science, 13, 288-299
Shumway, R.H. & Stoffer, D.S. (2011). Time series analysis and its applications. New York: Springer.
Wall, M.M., Guo, J., & Amemiya, Y. (2012). Mixture factor analysis for approximating a nonnormally distributed continuouslatent factor with continuous and dichotomous observed variables. Multivariate Behavioral Research, 47, 276-313.
Walls, T.D. & Schafer, J.L (2006). Models for intensive longitudinal data. Oxford Univ Press.
Wang, Hamaker, & Bergeman (2012). Investigating inter-individual differences in short-term intra-individual variability.Psychological Methods.
Bengt Muthen DSEM 96/ 96