APPLIED LATENT CLASS ANALYSIS:
A WORKSHOP
Katherine Masyn, Ph.D., Harvard University
December 5, 2013, Texas Tech University
Lubbock, TX
OVERVIEW
© Masyn (2013) LCA Workshop- 2 -
Statistical Modeling in the Mplus Framework (3)
The Finite Mixture Model Family (11)
Latent Class Analysis (LCA) (16)
LCA Example: LSAY (27)
LCA Model Building (36)
Direct and Indirect Applications (38)
Model Estimation (42)
Class Enumeration (55)
Fit Indices (60)
Classification Quality (70)
Summing It Up (79)
Latent Class Regression (LCR) (89)
"1-Step" Approach for Latent Class Predictors (102)
"Old" 3-Step Approach for Latent Class Predictors (105)
New 3-Step Approach for Latent Class Predictors (107)
Distal Outcomes (114)
Modeling Extensions (123)
Longitudinal Mixture Models (132)
Parting Words (143)
Questions? (151)
Select References & Resources (153)
STATISTICAL MODELING IN THE MPLUS FRAMEWORK
MODEL DIAGRAMS
Boxes for observed measures
Circles for latent variables
Single-headed arrows for "causal"/directional relationships
Double-headed arrows for "noncausal" relationships
Arrows not originating from a box or circle for residual or "unique" variances
MPLUS MODELING FRAMEWORK
[Diagram: the general Mplus modeling framework, showing the latent and observed variables below linked within- and between-level models]

η = continuous latent variable; c = categorical latent variable
y = continuous observed variable; u = discrete observed variable
T = continuous event time; x = observed continuous/categorical covariate

From: Muthén & Muthén, 1998-2013
© Muthén & Muthén (2013)
STATISTICAL CONCEPTS CAPTURED BY LATENT VARIABLES
Continuous LVs:
• Measurement errors
• Factors
• Random effects
• Frailties, liabilities
• Variance components
• Missing data

Categorical LVs:
• Latent classes
• Clusters
• Finite mixtures
• Missing data
STATISTICAL MODELS USING LATENT VARIABLES
Continuous LVs:
• Factor analysis; IRT
• Structural equation models
• Growth models
• Multilevel models
• Missing data models

Categorical LVs:
• Latent class analysis
• Finite mixture models
• Discrete-time survival analysis
• Missing data models
Mplus integrates the statistical concepts captured by latent variables into a general modeling framework that includes not only all of the models listed above but also combinations and extensions of these models.
MPLUS BACKGROUND
• Inefficient dissemination of statistical methods:
  – Many good methods contributions from biostatistics, psychometrics, etc. are underutilized in practice
• Fragmented presentation of methods:
  – Technical descriptions in many different journals
  – Many different pieces of limited software
• Mplus: Integration of methods in one framework
  – Easy to use: Simple, non-technical language, graphics
  – Powerful: General modeling capabilities
• Mplus versions:
  V1: November 1998    V2: February 2001
  V3: March 2004       V4: February 2006
  V5: November 2007    V5.2: November 2008
  V6: April 2010       V6.12: November 2011
  V7: September 2012   V7.1: May 2013
• Mplus team: Linda & Bengt Muthén, Thuy Nguyen, Tihomir Asparouhov, Michelle Conn, Jean Maninger
MPLUS V7.1* (WWW.STATMODEL.COM)

Several programs in one:
– Exploratory factor analysis
– Structural equation modeling
– Item response theory analysis
– Latent class analysis
– Latent transition analysis
– Mediation analysis
– Survival analysis
– Growth modeling
– Multilevel analysis
– Complex survey data analysis
– Monte Carlo simulation
– Bayesian analysis
– Multiple imputation

Fully integrated in the general latent variable framework

* Released in May 2013
THE FINITE MIXTURE MODEL FAMILY
FAMILY MEMBERS
The finite mixture model family includes:
• Cross-sectional:
  – Latent class analysis (LCA)
  – Latent profile analysis (LPA)
  – Latent class cluster analysis (LCCA)
  – Regression mixture models
  – Factor mixture models (FMM)
  – Etc.
• Longitudinal:
  – Growth mixture models (GMM)
  – Latent transition models (LTA)
  – Survival mixture analysis (SMA)
  – Etc.
LATENT CLASS ANALYSIS – CATEGORICAL LV AND CATEGORICAL MVS
[Diagram: the Mplus framework with the c → u measurement path highlighted – a categorical latent variable measured by categorical indicators]
LATENT PROFILE ANALYSIS / LATENT CLASS CLUSTER ANALYSIS – CATEGORICAL LV AND CONTINUOUS MVS
[Diagram: the Mplus framework with the c → y measurement path highlighted – a categorical latent variable measured by continuous indicators]
FINITE MIXTURE MODEL LIKELIHOOD
• The basic finite mixture model has the following likelihood function:

  f(y_i) = Σ_{k=1}^{K} π_k · f_k(y_i | θ_k)

• K is the number of latent classes.
• π_k is the proportion of the total population belonging to Class k.
• f_k(y_i | θ_k) is the class-specific density function for the latent class indicator (manifest) variables, with class-specific parameters θ_k.
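As a concrete sketch (an illustration in Python, not part of the workshop materials): for an LCA with binary indicators, each class-specific density f_k is a product of Bernoulli probabilities under conditional independence, and the mixture density is the π-weighted sum over classes. The parameter values below are hypothetical.

```python
import numpy as np

def lca_likelihood(Y, pi, rho):
    """Mixture density of binary response patterns under an LCA.

    Y   : (n, M) array of 0/1 responses
    pi  : (K,) class proportions (summing to 1)
    rho : (K, M) class-specific item-endorsement probabilities

    Under conditional independence, the class-k density of pattern y is
    prod_m rho[k,m]**y[m] * (1 - rho[k,m])**(1 - y[m]); the mixture
    density is the pi-weighted sum over the K classes.
    """
    # (n, K, M): Bernoulli pmf of each item under each class
    p = rho[None, :, :] ** Y[:, None, :] * (1 - rho[None, :, :]) ** (1 - Y[:, None, :])
    f_k = p.prod(axis=2)   # (n, K) class-specific densities
    return f_k @ pi        # (n,) mixture densities

# Hypothetical 2-class "mastery" model for 4 binary items
Y = np.array([[1, 1, 1, 1],
              [0, 0, 0, 0]])
pi = np.array([0.6, 0.4])
rho = np.array([[0.9, 0.9, 0.9, 0.9],   # mastery class
                [0.1, 0.1, 0.1, 0.1]])  # non-mastery class
print(lca_likelihood(Y, pi, rho))
```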
LATENT CLASS ANALYSIS (LCA)
[Path diagram: categorical latent variable c measured by four categorical indicators u1–u4]
TRADITIONAL LCA

• Categorical indicators
• Categorical latent variable
• Cross-sectional data
• Some consider LCA the categorical analogue to factor analysis.
• Sometimes referred to as person-centered analysis, in contrast to variable-centered analysis such as CFA.
• Different from IRT, which models categorical variables as indicators of an underlying continuous trait (ability).
FOR EXAMPLE

• Binary test items as multiple indicators for an underlying 2-level categorical latent variable representing profiles of Mastery and Non-mastery.
• DSM-IV symptom checklist (diagnostic criteria) for depression.
EXAMPLE DATA

Student | Item 1 | Item 2 | Item 3 | Item 4
   1    |   1    |   1    |   1    |   1
   2    |   0    |   0    |   0    |   0
   3    |   1    |   0    |   1    |   0
   4    |   1    |   0    |   0    |   0
   5    |   0    |   0    |   1    |   0
   6    |   1    |   1    |   1    |   0
   7    |   1    |   1    |   1    |   0
NAÏVE APPROACH

• Create a cut-point based on the sum score, e.g., clinical depression if satisfying 5 or more of the 9 symptoms; mastery defined as 80% of items correctly answered.
• Problems:
  – Treats all items the same, e.g., doesn't take into account that some items may be more "difficult" than others.
  – Doesn't take into account measurement error, e.g., someone with Mastery status may still make a careless error.
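The cut-point rule can be sketched against the hypothetical 7-student data from the earlier slide (the 80% mastery cut comes from the bullet above; the code itself is illustrative, not part of the workshop materials):

```python
# Naive cut-point classification of the example data: "Mastery" if a
# student answers at least 80% of the 4 items correctly.
data = {1: [1, 1, 1, 1], 2: [0, 0, 0, 0], 3: [1, 0, 1, 0],
        4: [1, 0, 0, 0], 5: [0, 0, 1, 0], 6: [1, 1, 1, 0],
        7: [1, 1, 1, 0]}

def classify(items, cut=0.80):
    return "Mastery" if sum(items) / len(items) >= cut else "Non-mastery"

labels = {s: classify(items) for s, items in data.items()}
print(labels)
# Every item counts the same: students 6 and 7 (3 of 4 correct) fall
# below the cut even if their one miss is on the hardest item, and a
# single careless error by a true master flips the label.
```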
LCA APPROACH

• Characterizes groups of individuals based on response patterns for multiple indicators.
• Class membership "explains" observed covariation between indicators.
• Allows for measurement error in that class-specific item probabilities may be between zero and one.
• Allows comparisons of indicator sensitivity and specificity to identify items that best differentiate the classes.
• Estimates the prevalence of each class in the population.
• Enables stochastic classification of individuals into classes.
ITEM PROBABILITY PLOTS
MEASUREMENT CHARACTERISTICS
• Class homogeneity – Individuals within a given class are similar to each other with respect to item responses; e.g., for binary items, class-specific response probabilities above .70 or below .30 indicate high homogeneity.
• Class separation – Individuals across two classes are dissimilar with respect to item responses; e.g., for binary items, odds ratios (ORs) of item endorsement between two classes >5 or <.2 indicate high separation.
ITEM PROBABILITY PLOTS
Item | Class 1 (70%) | Class 2 (20%) | Class 3 (10%) | OR: 1 vs. 2 | OR: 1 vs. 3 | OR: 2 vs. 3
u1   |     .90*      |     .10       |     .90       |   81.00**   |    1.00     |    0.01
u2   |     .80       |     .20       |     .90       |   16.00     |    0.44     |    0.03
u3   |     .90       |     .40       |     .50       |   13.50     |    9.00     |    0.67
u4   |     .80       |     .10       |     .20       |   36.00     |   16.00     |    0.44
u5   |     .60       |     .50       |     .40       |    1.50     |    2.25     |    1.50

* Item probabilities >.7 or <.3 are bolded to indicate a high degree of class homogeneity.
** Odds ratios >5 or <.2 are bolded to indicate a high degree of class separation.
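The odds ratios in the table follow directly from the class-specific endorsement probabilities; a minimal Python check (illustrative, not workshop code):

```python
def endorsement_or(p1, p2):
    """Odds ratio of item endorsement between two classes."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# u1, Class 1 (.90) vs. Class 2 (.10): odds 9 vs. 1/9 -> OR = 81 (high separation)
or_u1_12 = endorsement_or(0.90, 0.10)
# u5, Class 1 (.60) vs. Class 2 (.50): odds 1.5 vs. 1 -> OR = 1.5 (low separation)
or_u5_12 = endorsement_or(0.60, 0.50)
print(round(or_u1_12, 2), round(or_u5_12, 2))
```

Against the >5 or <.2 rule of thumb, u1 separates Classes 1 and 2 very well, while u5 barely separates any pair of classes.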
LCA EXAMPLE: LSAY
EXAMPLE: LONGITUDINAL STUDY OF AMERICAN YOUTH (LSAY)
• A national longitudinal study funded by the National Science Foundation (NSF).
• Designed to investigate the development of students' learning and achievement, particularly related to math, science, and technology, and to examine the relationship of those student outcomes across middle and high school to post-secondary education and early career choices.
• More information can be found at http://lsay.org/index.html
LCA EXAMPLE: LSAY
• Research Aim:
  – Characterize population heterogeneity in math attitudes (manifest in 9 survey items) using latent classes of math dispositions.
• Why not state research questions like:
  – Are there different profiles of math dispositions based on the math attitude items?
  – How many profiles are there?
  – What are the profiles?
Survey Prompt: "Now we would like you to tell us how you feel about math and science. Please indicate how you feel about each of the following statements."

Total sample (nT = 2675)

Item                                                                       |  f   | rf
1) I enjoy math.                                                           | 1784 | .67
2) I am good at math.                                                      | 1850 | .69
3) I usually understand what we are doing in math.                         | 2020 | .76
4) Doing math often makes me nervous or upset.                             | 1546 | .59
5) I often get scared when I open my math book and see a page of problems. | 1821 | .69
6) Math is useful in everyday problems.                                    | 1835 | .70
7) Math helps a person think logically.                                    | 1686 | .64
8) It is important to know math to get a good job.                         | 1947 | .74
9) I will use math in many ways as an adult.                               | 1858 | .70
Usevariables = ca28ar ca28br ca28cr ca28er ca28gr ca28hr ca28ir ca28kr ca28lr;
CATEGORICAL = ca28ar ca28br ca28cr ca28er ca28gr ca28hr ca28ir ca28kr ca28lr;
missing = all(9999);
classes = c(5);

Analysis:
  type = mixture;
  starts = 500 100;
  processors = 4;

Model:
  (next slide)
Model:
%overall%
[ ca28ar$1 ca28br$1 ca28cr$1 ca28er$1 ca28gr$1 ca28hr$1 ca28ir$1 ca28kr$1 ca28lr$1 ];

%c#1%
[ ca28ar$1 ca28br$1 ca28cr$1 ca28er$1 ca28gr$1 ca28hr$1 ca28ir$1 ca28kr$1 ca28lr$1 ];

%c#2%
[ ca28ar$1 ca28br$1 ca28cr$1 ca28er$1 ca28gr$1 ca28hr$1 ca28ir$1 ca28kr$1 ca28lr$1 ];

.
.
.

%c#5%
[ ca28ar$1 ca28br$1 ca28cr$1 ca28er$1 ca28gr$1 ca28hr$1 ca28ir$1 ca28kr$1 ca28lr$1 ];
Note: With categorical indicators, the following model statement would produce the same result!

Model:
LCA EXAMPLE: LSAY
[Profile plot of class-specific item probabilities for the 5-class model]
Classes: 1 – Pro-math without anxiety; 2 – Pro-math with anxiety; 3 – Math lover; 4 – I don't like math but I know it's good for me; 5 – Anti-math with anxiety
LCA EXAMPLE: LSAY
                    Estimate    S.E.   Est./S.E.   Two-Tailed P-Value
Latent Class 1
 Thresholds
  CA28AR$1           -2.122    0.185    -11.442      0.000
  CA28BR$1           -2.539    0.242    -10.514      0.000
  CA28CR$1           -3.081    0.291    -10.577      0.000
  CA28ER$1           -1.791    0.371     -4.825      0.000
  CA28GR$1          -15.000    0.000    999.000    999.000
  CA28HR$1           -2.498    0.262     -9.533      0.000
  CA28IR$1           -1.839    0.188     -9.781      0.000
  CA28KR$1           -2.876    0.324     -8.866      0.000
  CA28LR$1           -2.723    0.310     -8.775      0.000

RESULTS IN PROBABILITY SCALE
Latent Class 1
 CA28AR
  Category 1          0.107    0.018      6.039      0.000
  Category 2          0.893    0.018     50.392      0.000
LCA EXAMPLE: LSAY
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL

Latent Classes
  1    525.13598    0.39248
  2    173.96909    0.13002
  3    244.13155    0.18246
  4    254.57820    0.19027
  5    140.18517    0.10477
LCA MODEL BUILDING
MIXTURE MODEL BUILDING STEPS
1. Data screening and descriptives.
2. Class enumeration process.
3. Select final unconditional model (this is your measurement model).
4. Add potential predictors (and check for measurement invariance).
5. Add potential distal outcomes.
DIRECT AND INDIRECT APPLICATIONS
DIRECT VS. INDIRECT APPLICATION
[Figure: the same outcome distribution for y depicted two ways – as a mixture of two normal subpopulations indexed by c, and as a single non-normal population]

Is the "Truth" a heterogeneous population composed of a mixture of two normally-distributed homogeneous subpopulations? Or is the "Truth" a single, non-normally-distributed homogeneous population?
DIRECT APPLICATIONS OF MIXTUREMODELING
• Mixture models are used with the a priori assumption that the overall population is heterogeneous and made up of a finite number of (latent and substantively meaningful) homogeneous groups or subpopulations, usually specified to have tractable distributions of indicators within groups, such as a multivariate normal distribution.
INDIRECT APPLICATIONS OF MIXTUREMODELING
• It is assumed that the overall population is homogeneous, and finite mixtures are simply used as a more tractable, semi-parametric technique for modeling a population of outcomes for which it may not be possible (practically or analytically speaking) to specify a parametric model.
• The focus for indirect applications is then not on the resultant mixture components nor their interpretation, but rather on the overall population distribution approximated by the mixing.
MODEL ESTIMATION
ML ESTIMATION FOR LCA
• c is treated as missing data under MAR.
• MAR assumes that the probabilities of values being missing are independent of the missing values conditional on those values that are observed (both u and x). (Little and Rubin, 2002)
• Basic principle of ML: Choose estimates of the model parameters whose values, if true, would maximize the probability of observing what had, in fact, been observed.
• This requires an expression that describes the distribution of the data as a function of the unknown parameters, i.e., the likelihood function.
• Under MAR, the ML estimates for the complete data may be obtained by maximizing the likelihood function summed over all possible values of the missing data, i.e., by integrating out the missingness.
• Often, this integrated likelihood cannot be maximized analytically and requires an iterative estimation procedure, e.g., EM.
THE EM ALGORITHM
• How does it work?
  – Start with a random split of people into classes.
  – Reclassify based on an improvement criterion.
  – Reclassify until the "best" classification of people is found.
• The EM algorithm is a missing data technique. In this application, the latent class variable is the missing data – and it happens to be missing for the entire data set.
ML ESTIMATION VIA EM ALGORITHM
E(xpectation) step: c is treated as missing data. Missing values c_i are replaced by the conditional means of c_i given the y_i's. These means are the posterior probabilities for each class.

M(aximization) step: New estimates of the parameters are obtained from the maximization based on the estimated complete data. The Pr(y_j | c = k) and Pr(c = k) parameters are estimated by regression and summation over the posterior probabilities.

• Missing data is allowed on the y's as well, assuming MAR.
• Standard errors are obtained using some approximation to the Fisher information matrix. (In Mplus, "ML" is the default for no missing data on the y's; "MLR" for missing data on indicators.)
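The E- and M-steps above can be sketched for an LCA with binary indicators. This is a minimal numpy illustration with a single random start and simulated data, not the Mplus implementation; in practice many random starts are needed, as the following slides discuss.

```python
import numpy as np

rng = np.random.default_rng(0)

def lca_em(Y, K, n_iter=200, tol=1e-8):
    """EM for a latent class model with binary indicators (sketch).

    E-step: posterior class probabilities given current (pi, rho).
    M-step: pi becomes the mean posterior; rho the posterior-weighted
    item means (closed-form "regression and summation" updates).
    """
    n, M = Y.shape
    pi = np.full(K, 1.0 / K)
    rho = rng.uniform(0.25, 0.75, size=(K, M))  # one random start
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step: log class-specific densities plus log priors
        logf = Y @ np.log(rho).T + (1 - Y) @ np.log(1 - rho).T + np.log(pi)
        m = logf.max(axis=1, keepdims=True)
        f = np.exp(logf - m)
        post = f / f.sum(axis=1, keepdims=True)   # posterior probabilities
        ll = float((np.log(f.sum(axis=1)) + m.ravel()).sum())
        # M-step
        pi = post.mean(axis=0)
        rho = np.clip((post.T @ Y) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)
        if ll - ll_old < tol:
            break
        ll_old = ll
    return pi, rho, post, ll

# Simulate two well-separated classes (60/40 split) and recover them
n = 1000
z = rng.random(n) < 0.6
Y = (rng.random((n, 5)) < np.where(z[:, None], 0.9, 0.1)).astype(float)
pi, rho, post, ll = lca_em(Y, K=2)
print(np.round(np.sort(pi), 2))   # class proportions near .40/.60
```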
THE CHALLENGES OF ML VIA EM

• MLE for mixture models can present statistical and numeric challenges that must be addressed during the application of mixture modeling:
  – The estimation may fail to converge even if the model is theoretically identified.
  – If the estimation algorithm does converge, since the log likelihood surface for mixtures is often multimodal, there is no way to prove the solution is a global rather than local maximum.
How would you distinguish between these two cases?
MOST IMPORTANTLY:
• Use multiple random sets of starting values with the estimation algorithm—it is recommended that a minimum of 50 to 100 sets of extensively, randomly varied starting values be used (Hipp & Bauer, 2006), but more may be necessary to observe satisfactory replication of the best maximum log likelihood value.
• Recommendations for a more thorough investigation of multiple solutions when there are more than two classes:
  ANALYSIS: STARTS = 50 5;
  or, with many classes,
  ANALYSIS: STARTS = 500 10;
Note: LL replication is neither necessary nor sufficient for a given solution to be the global maximum.
And keep track of the following information:
• The number and proportion of sets of random starting values that converge to a proper solution (as failure to consistently converge can indicate weak identification);
• The number and proportion of replicated maximum likelihood values for each local and the apparent global solution (as a high frequency of replication of the apparent global solution across the sets of random starting values increases confidence that the "best" solution found is the true maximum likelihood solution);
• The condition number, computed as the ratio of the smallest to largest eigenvalue of the information matrix estimate based on the maximum likelihood solution. A low condition number, less than 10^-6, may indicate singularity (or near-singularity) of the information matrix and, hence, model non-identification (or empirical underidentification);
• The smallest estimated class proportion and estimated class size among all the latent classes estimated in the model (as a class proportion near zero can be a sign of class collapsing and class over-extraction).
• This information, when examined collectively, will assist in flagging models that are non-identified or not well-identified and whose maximum likelihood solutions, if obtained, are not likely to be stable or trustworthy. These not-well-identified models should be discarded from further consideration, or mindfully modified in such a way that the empirical issues surrounding the estimation for that particular model are resolved without compromising the theoretical integrity and substantive foundations of the analytic model.
CLASS ENUMERATION
NOW THE HARD PART

• In the majority of applications of mixture modeling, the number of classes is not known.
• Even in direct applications, when one assumes a priori that the population is heterogeneous, you rarely have specific hypotheses regarding the exact number or nature of the subpopulations.
• Thus, in either case (direct or indirect), you must begin the model building with an exploratory class enumeration step.
• Deciding on the number of classes is often the most arduous phase of the mixture modeling process.
• It is labor intensive because it requires consideration (and, therefore, estimation) of a set of models with varying numbers of classes.
• It is complicated in that the selection of a "final" model from the set of models under consideration requires the examination of a host of fit indices along with substantive scrutiny and practical reflection, as there is no single method for comparing models with differing numbers of latent classes that is widely accepted as best.
EVALUATING THE MODEL
The statistical tools are divided into three categories:
1. evaluations of absolute fit;
2. evaluations of relative fit;
3. evaluations of classification.

Model Usefulness:
• Substantively meaningful and substantively distinct classes (face + content validity)
• Cross-validation in a second sample (or split sample)
• Parsimony principle
• Criterion-related validity
CLASS ENUMERATION PROCESS FOR LCA
• Fit models for K = 1, 2, 3, …, increasing K until the models become not well-identified.
• Collect fit information on each model using a combination of statistical tools.
• Decide on 1–2 "plausible" models.
• Apply a broader set of statistical tools to the set of candidate models and evaluate model usefulness.
FIT INDICES
ABSOLUTE FIT
• There is an overall likelihood ratio model chi-square goodness-of-fit test for a mixture measurement model with only categorical indicators (using a formula similar to the goodness-of-fit chi-square for contingency table analyses and log-linear models).
• "Inspection" = Look at standardized residuals evaluating the difference between the observed response pattern frequencies and the model-estimated frequencies.
RELATIVE FIT
1. Inferential: The most common ML-based inferential comparison is the likelihood ratio test (LRT) for nested models (e.g., a K=3 vs. a K=4 class model).

Hypothesis testing using the likelihood ratio:
  H0: k classes
  H1: k+1 classes
  LRTS = -2 [ log L(H0) - log L(H1) ]

When testing a k-class mixture model versus a (k+g)-class model, the LRTS does not have an asymptotic chi-squared distribution. Why? Regularity conditions are not met: a mixing proportion of zero is on the boundary of the parameter space, and the parameters under the null model are not identifiable.
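The statistic itself is simple to compute; what the bootstrap (Tech14-style) approach adds is a valid reference distribution. A sketch with hypothetical log likelihoods and hypothetical bootstrap replicate values (not real workshop output):

```python
def lrts(ll_null, ll_alt):
    """Likelihood ratio test statistic: -2 [log L(H0) - log L(H1)]."""
    return -2 * (ll_null - ll_alt)

# Observed statistic for a k-class null vs. a (k+1)-class alternative
observed = lrts(-12800.0, -12650.0)  # -2 * (-150) = 300.0

# Because the usual chi-square reference does not apply, a parametric
# bootstrap LRT compares `observed` to LRTS values from data sets
# simulated and refit under the k-class null (hypothetical values here):
bootstrap = [41.2, 38.7, 52.9, 45.0, 36.1]
p_value = sum(b >= observed for b in bootstrap) / len(bootstrap)
print(observed, p_value)
```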
SOLUTIONS?

• Analytically derive the distribution of the LRTS: adjusted VLMR-LRT (Tech11 in Mplus)
  – Vuong (1989) derived an LRT for model selection based on the Kullback & Leibler (1951) information criterion. Lo, Mendell, and Rubin (2001) extended Vuong's theorem to cover the LRT for a k-class normal mixture versus a (k+g)-class normal mixture.
• Empirically derive the distribution of the LRTS: (parametric) Bootstrap LRT (Tech14 in Mplus)

NOTE: For both Tech11 and Tech14, Mplus computes the LRT for your K-class model compared to a model with one less class (i.e., the K-1 class model as the null). Make sure the H0 loglikelihood value given in Tech11/Tech14 matches the best LL solution you obtained in your own K-1 class run.
2. Information-heuristic criteria: These indices weigh the fit of the model (as captured by the maximum log likelihood value) against the model complexity (recognizing that although one can always improve the fit of a model by adding parameters, there is a cost of that improvement in fit to model parsimony).

• These information criteria can be expressed in the following form:

  IC = -2 log L + penalty

• The traditional penalty is a function of n and d, where n = sample size and d = number of parameters.
INFORMATION CRITERIA

• Bayesian Information Criterion:
  BIC = -2 log L + d · log(n)
• Consistent Akaike's Information Criterion:
  CAIC = -2 log L + d · [log(n) + 1]
• Approximate Weight of Evidence Criterion:
  AWE = -2 log L + 2d · [log(n) + 3/2]
• For these ICs, lower values indicate a better model, relatively speaking. Sometimes a minimum value is not reached, and scree/"elbow" plots are utilized.
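As a sketch, the three criteria can be computed from the maximum log likelihood, the number of free parameters d, and the sample size n. The enumeration results below are hypothetical, and the formulas used are the common definitions (BIC = -2 log L + d ln n, etc.), an assumption since the slide's own formulas were not reproduced in this transcript:

```python
from math import log

def bic(ll, d, n):  return -2 * ll + d * log(n)
def caic(ll, d, n): return -2 * ll + d * (log(n) + 1)
def awe(ll, d, n):  return -2 * ll + 2 * d * (log(n) + 1.5)

# Hypothetical enumeration results: K -> (max log likelihood, free parameters)
models = {1: (-13500.0, 9), 2: (-12800.0, 19), 3: (-12650.0, 29),
          4: (-12620.0, 39), 5: (-12615.0, 49)}
n = 2675
for k, (ll, d) in models.items():
    print(k, round(bic(ll, d, n), 1), round(caic(ll, d, n), 1),
          round(awe(ll, d, n), 1))
# Lower is better; look for the minimum or an "elbow" across K.
```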
How much lower does an IC value have to be to mean the model is really better?
• Bayes Factor: Which model, A or B, is more likely to be the true model if one of the two is the true model?
• The approximate correct model probability (cmP) for a Model A is an approximation of the actual probability of Model A being the correct model, relative to a set of J models under consideration.
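Both quantities can be sketched from the models' BIC values using the Schwarz-information approximation SIC = -BIC/2 (an assumption here, as the slide's formulas were not reproduced in this transcript). The BIC values below are hypothetical:

```python
import math

def cmp_and_bf(bics):
    """Approximate correct-model probabilities and a Bayes factor from BICs.

    SIC = -BIC/2; BF(A, B) ~= exp(SIC_A - SIC_B);
    cmP_A = exp(SIC_A) / sum_j exp(SIC_j), computed with a max shift
    for numerical stability.
    """
    sic = [-b / 2 for b in bics]
    m = max(sic)
    w = [math.exp(s - m) for s in sic]
    cmps = [x / sum(w) for x in w]
    bf_12 = math.exp(sic[0] - sic[1])  # model 1 vs. model 2
    return cmps, bf_12

cmps, bf = cmp_and_bf([25749.9, 25528.9, 25547.8])  # hypothetical BICs
print([round(p, 3) for p in cmps])
# Nearly all of the probability lands on the model with the lowest BIC.
```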
CLASSIFICATION QUALITY
CLASSIFICATION QUALITY / CLASS SEPARATION
• A good mixture model in a direct application* should yield empirically highly-differentiated, well-separated latent classes whose members have a high degree of homogeneity in their responses on the class indicators.

* A well-fitting mixture model can have very poor class separation. Classification quality is not a measure of model fit!
• Almost all of the classification diagnostics are based on estimated posterior class probabilities.
• Posterior class probabilities are the model-estimated values for each individual's probabilities of being in each of the latent classes, based on the maximum likelihood parameter estimates and the individual's observed responses on the indicator variables (similar to estimated factor scores).
RELATIVE ENTROPY

• An index that summarizes the overall precision of classification for the whole sample across all the latent classes:

  E_K = 1 - [ Σ_i Σ_k ( -p̂_ik · ln p̂_ik ) ] / ( n · ln K )

• When posterior classification is no better than random guessing, E = 0, and when there is perfect posterior classification for all individuals in the sample, E = 1.
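A minimal illustration of relative entropy computed from a posterior probability matrix, assuming the common definition E_K = 1 - Σ_i Σ_k (-p_ik ln p_ik) / (n ln K):

```python
import numpy as np

def relative_entropy(post):
    """Relative entropy E_K from an (n, K) posterior probability matrix."""
    n, K = post.shape
    p = np.clip(post, 1e-12, 1)  # guard log(0)
    return 1 - (-(p * np.log(p)).sum()) / (n * np.log(K))

perfect = np.array([[1.0, 0.0], [0.0, 1.0]])   # certain classification
random_ = np.array([[0.5, 0.5], [0.5, 0.5]])   # no better than guessing
print(relative_entropy(perfect), relative_entropy(random_))
```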
• Since even when E is close to 1.00 there can be a high degree of latent class assignment error for particular individuals, and since posterior classification uncertainty may increase simply by chance for models with more latent classes, E was never intended for, nor should it be used for, model selection during the class enumeration process. (REMEMBER: A mixture model with low entropy could still fit the data well.)
• However, values near zero may indicate that the latent classes are not sufficiently well-separated. Thus, E may be used to identify problematic over-extraction of latent classes, and to judge the utility of a latent class analysis directly applied to a particular set of indicators for producing empirically highly-differentiated groups in the sample.
AVEPP

• The average posterior class probability (AvePP) enables evaluation of the classification uncertainty for each of the latent classes separately.
• AvePP(k) is the average posterior class probability for Class k among all individuals whose maximum posterior class probability is for Class k (i.e., individuals modally assigned to Class k).
• Nagin suggests AvePP values >.7 indicate adequate separation and classification precision.
OCC

• The denominator of the odds of correct classification ratio (OCC) is the odds of correct classification based on random assignment using the model-estimated marginal class proportions.
• The numerator is the odds of correct classification based on the maximum posterior class probability assignment rule (i.e., modal class assignment).
• When the modal class assignment for Class k is no better than chance, then OCC(k) = 1.
• As AvePP(k) gets close to one, OCC(k) gets large.
• Nagin suggests OCC(k) > 5 indicates adequate separation and classification precision.
MCAP

• The modal class assignment proportion (mcaP) is the proportion of individuals in the sample modally assigned to Class k.
• If individuals were assigned to Class k with perfect certainty, then mcaP(k) would be equal to the model-estimated Pr(c=k). Larger discrepancies are indicative of larger latent class assignment errors.
• To gauge the discrepancy, each mcaP can be compared to the 95% confidence interval for the corresponding model-estimated Pr(c=k).
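The three diagnostics above can be computed together from the matrix of posterior class probabilities and the estimated class proportions; a small illustrative sketch (the posterior matrix and proportions below are made up):

```python
import numpy as np

def classification_diagnostics(post, pi_hat):
    """AvePP, OCC, and mcaP per class from posterior probabilities.

    post   : (n, K) posterior class probabilities
    pi_hat : (K,) model-estimated class proportions
    """
    n, K = post.shape
    modal = post.argmax(axis=1)                     # modal class assignment
    avepp = np.array([post[modal == k, k].mean() for k in range(K)])
    odds = lambda p: p / (1 - p)
    occ = odds(avepp) / odds(pi_hat)                # = 1 when no better than chance
    mcap = np.bincount(modal, minlength=K) / n
    return avepp, occ, mcap

post = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]])
avepp, occ, mcap = classification_diagnostics(post, np.array([0.5, 0.5]))
print(np.round(avepp, 3), np.round(occ, 3), mcap)
```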
SUMMING IT UP
[The numbered summary steps of the class enumeration process appeared as equations and figures on the original slides (pp. 80–87) and are not reproduced here.]
AND, FINALLY
• On the basis of all the comparisons made in Steps 5 and 6, select the final model in the class enumeration process.
  – Note: You may end up carrying forward two candidate models into the conditional modeling stage.
• If you had a large enough sample to do a split-half cross-validation, now is when you would look at the validation sample.
LATENT CLASS REGRESSION (LCR)
LATENT CLASS VALIDATION
• Link the conceptual/theoretical aspects of the latent class variable with observable variables.
• "[To] make clear what something is" means to set forth the laws in which it occurs.
• Cronbach & Meehl (1955) termed this process the nomological (or lawful) network.
LINKAGES: CRITERION-RELATED VALIDITY

• In criterion-related validity (concurrent and predictive), we check the performance of our latent classes against some criterion based on our theory of the construct represented by the latent class variable.
  – Concurrent: Latent class membership predicted by or covarying with past or concurrent events (latent class regression).
  – Predictive: Latent class membership predicting future concrete events (latent class with distal outcomes).
COVARIATES AND MIXTURE MODELS
[Path diagram: a covariate (risk factor) with an indirect effect on indicators u1–u5 through the latent class variable C, and a direct effect on an indicator]
LATENT CLASS REGRESSION
• Like a MIMIC model in regular CFA/SEM.
• Categorical latent variable.
• Continuous or categorical covariates with direct effects on the y's, or indirect effects on the y's through c.
  – Indirect effects can also be thought of as predictors of class membership.
  – Direct effects can also be thought of as differential item functioning.
INCLUDING COVARIATES INTO LCA
• The inclusion of covariates into mixture models:
  – Allows us to explore relationships between mixture classes and auxiliary information.
  – Helps us understand how different classes relate to risk and protective factors.
  – Lets us explore differences in demographics across the classes.
“C ON X” = MULTINOMIAL REGRESSION
• Multinomial logistic regression is essentially a set of simultaneous logistic regressions of the odds of each outcome category versus a reference/baseline category.
• Mplus uses the last category/class as the baseline.
• So for K classes, we have K-1 logit equations.
• We model the following: Given membership in either Class k or Class K, what is the log odds that class membership is k (instead of K), given x? That is,

  log [ Pr(c = k | x) / Pr(c = K | x) ] = γ_0k + γ_1k · x
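Given estimated intercepts and slopes for the K-1 logit equations, the class-membership probabilities at a covariate value follow from the softmax form, with the reference class's logit fixed at 0. An illustrative sketch with hypothetical coefficients:

```python
import math

def class_probs(x, alphas, betas):
    """Pr(c = k | x) from K-1 logit equations with Class K as reference.

    log[Pr(c=k|x) / Pr(c=K|x)] = alphas[k] + betas[k] * x, and the
    reference class contributes a logit of 0.
    """
    logits = [a + b * x for a, b in zip(alphas, betas)] + [0.0]
    den = sum(math.exp(t) for t in logits)
    return [math.exp(t) / den for t in logits]

# Hypothetical 3-class example: intercepts/slopes for classes 1..K-1
probs = class_probs(x=1.0, alphas=[0.5, -0.2], betas=[0.3, 0.6])
print([round(p, 3) for p in probs])
```

Exponentiating a coefficient gives the odds ratio for that class versus the reference class per unit change in x.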
LSAY EXAMPLE

[Path diagram: covariate "Male" predicting latent class variable C, which is measured by the math attitude items ("I enjoy math," "I am good at math," …, "I will use math later")]

MODEL:
%Overall%
c on male;
LCA EXAMPLE: LSAY
Classes: 1 – Pro-math without anxiety; 2 – Pro-math with anxiety; 3 – Math lover; 4 – I don't like math but I know it's good for me; 5 – Anti-math with anxiety
EXAMPLE: LSAY WITH COVARIATE

Categorical Latent Variables*
 C#1 ON FEMALE     0.320    0.217     1.476    0.140
 C#2 ON FEMALE    -0.343    0.269    -1.274    0.203
 C#3 ON FEMALE     0.485    0.266     1.823    0.068
 C#4 ON FEMALE     0.865    0.258     3.356    0.001

* Class 5 is the reference group
There is a statistically significant overall association between gender and math disposition:
- Null model (no effect of female) vs. alternative model (c on female): df = 4, p < .001
- Interpretation of coefficients:
  - Given membership in either Class 1 or 5, girls are as likely to be in Class 1 as boys (p = .14).
  - Given membership in either Class 2 or 5, girls are as likely to be in Class 2 as boys (p = .20).
  - Etc.
Classes: 1 – Pro-math without anxiety; 2 – Pro-math with anxiety; 3 – Math lover; 4 – I don't like math but I know it's good for me; 5 – Anti-math with anxiety
EXAMPLE: LSAY WITH COVARIATE
ALTERNATIVE PARAMETERIZATIONS FOR THE CATEGORICAL LATENT VARIABLE REGRESSION

Switching the reference group to Class 1:

Parameterization using Reference Class 1
 C#2 ON FEMALE    -0.662    0.205    -3.223    0.001
 C#3 ON FEMALE     0.165    0.207     0.798    0.425
 C#4 ON FEMALE     0.545    0.187     2.916    0.004
 C#5 ON FEMALE    -0.320    0.217    -1.476    0.140

Classes: 1 – Pro-math without anxiety; 2 – Pro-math with anxiety; 3 – Math lover; 4 – I don't like math but I know it's good for me; 5 – Anti-math with anxiety
EXAMPLE: LSAY WITH COVARIATE
"1-STEP" APPROACH FOR LATENT CLASS PREDICTORS
LCR MODELING PROCESS

1. Fit models without covariates first.
2. Decide on the number of classes.
3. Integrate covariate (indirect) effects in a systematic way. (You can preview a covariate, x, using the auxiliary = x (r) or (r3step) option in the Variable command.) Include indirect effects (class predictors) first, with direct effects fixed at zero, and then explore the evidence for direct effects using modindices.
4. Add direct effects as suggested by modindices, but do not let them vary across classes.
5. Trim until only significant direct effects remain.

NOTE: This is just like MIMIC modeling in SEM.
Also NOTE: There are other approaches currently in development for detection of direct effects and DIF more generally.
WHY NOT ADD CLASS-VARYING DIRECT EFFECTS?
[Path diagram: covariate X → latent class C; C measured by u1–u5; possible direct effect X → u4]

Indirect effect (Mplus):
  %OVERALL%
  C ON X;

Direct effect:
  %OVERALL%
  U4 ON X;

Class-varying direct effect:
  %C#1%
  U4 ON X;
  %C#2%
  U4 ON X;
“OLD” 3-STEP APPROACH FOR LATENT CLASS PREDICTORS
• Estimate the LCA model.
• Determine each subject's most likely class membership ("hard" classify people using modal class assignment).
• Save the class assignment and use it in a separate analysis as an observed multinomial outcome, relating predictors to class membership.
• Problematic: Unless the classification is very good (high entropy), this gives biased estimates and biased standard errors for the relationships of class membership with other variables.
NEW 3-STEP APPROACH FOR LATENT CLASS PREDICTORS
BASIC IDEA
• The real problem with classify-analyze (the "old" 3-step approach) is that it ignores the uncertainty/imprecision in classification.
• Based on the results of the unconditional LCA, we can compile information about classification quality that we can then use in a subsequent model (akin to using a previously estimated scale reliability to specify the measurement error variance in an SEM model).
  – The information is summarized in: Logits for the Classification Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column).
• Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column) estimates
    Pr(C = j | CMOD = k), for j = 1, …, K; k = 1, …, K.
• Classification Probabilities for the Most Likely Latent Class Membership (Row) by Latent Class (Column) estimates
    Pr(CMOD = k | C = j), for j = 1, …, K; k = 1, …, K.
• How do you get from one quantity to the other? Bayes' Theorem:
    Pr(CMOD = k | C = j) = Pr(C = j | CMOD = k) · Pr(CMOD = k) / Pr(C = j)
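The conversion between the two tables can be made concrete with a small numeric example. The posterior-probability table and modal-class shares below are hypothetical, not from the LSAY output:

```python
# Hypothetical 3-class example.
# avg_post[k][j] = Pr(C = j+1 | CMOD = k): average posterior probabilities,
# one row per modal class k.
avg_post = {1: [0.90, 0.07, 0.03],
            2: [0.05, 0.88, 0.07],
            3: [0.04, 0.06, 0.90]}
modal_share = {1: 0.40, 2: 0.35, 3: 0.25}   # Pr(CMOD = k)

# Marginal class probabilities: Pr(C = j) = sum_k Pr(C = j | CMOD = k) Pr(CMOD = k)
pr_c = [sum(avg_post[k][j] * modal_share[k] for k in avg_post) for j in range(3)]

# Bayes' Theorem: Pr(CMOD = k | C = j)
#   = Pr(C = j | CMOD = k) * Pr(CMOD = k) / Pr(C = j)
classification = {k: [avg_post[k][j] * modal_share[k] / pr_c[j] for j in range(3)]
                  for k in avg_post}
```

Each column of the resulting classification table sums to 1: given true class j, every case is modally assigned to some class.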
1. Estimate the LCA model.
2. Create a nominal most likely class variable, CMOD.
3. Use a mixture model for CMOD, C, and X, where CMOD is the nominal indicator of C with measurement error rates fixed in advance at the misclassification rates from the Step 1 LCA.
   – The information is summarized in: Logits of Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column).
To do this in Mplus for X, use the auxiliary = X (r3step) option in the VARIABLE command.
[Path diagram: X → C (estimated); C → CMOD (measurement fixed according to the Step 1 misclassification rates)]
MANUAL R3STEP
STEP 1:
• Run the model with covariate(s) as auxiliary variables. Include:
    SAVEDATA:
      FILE IS step1save.dat;
      SAVE = CPROB;
STEP 2:
• Create a new input file using:
    DATA:
      FILE IS step1save.dat;
    VARIABLE:
      USEVAR = cmod x;
      NOMINAL = cmod;
• Use the values from the rows of the Logits for the Classification Probabilities for the Most Likely Latent Class Membership (Row) by Latent Class (Column) table in the Step 1 output to fix the class-specific multinomial intercepts for cmod.
STEP 3:
• Specify the LCR of "c on x" and run.
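The fixed intercepts are just the log-odds of each row of the classification probability table against the last class. A sketch with a hypothetical row (values are illustrative, not from any output):

```python
import math

# One row of the Classification Probabilities table for latent class j:
# Pr(CMOD = k | C = j) for k = 1..3; the last class is the logit reference.
row = [0.929, 0.045, 0.026]

# Logits reported by Mplus, and fixed via [cmod#k@value] within %c#j%:
logits = [math.log(p / row[-1]) for p in row[:-1]]
```

With K classes there are K - 1 fixed logits per row, one per non-reference category of cmod.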
DISTAL OUTCOMES
DISTAL OUTCOMES AND MIXTUREMODELS
[Path diagram: latent class C measured by u1–u5, with C → distal outcome]
AN EVER-GROWING # OF APPROACHES
• 1-step
• "Old" 3-step (classify-analyze)
• Modified 1-step
• Pseudo-class draws
  – Auxiliary = z (E);
• New 3-step
  – Auxiliary = z (DU3step) or (DE3step)
  – Manual 3-step
• New Bayes' Theorem approach by Lanza et al. (2013)
  – Auxiliary = z (DCON) or (DCAT)
1-STEP
• Also referred to as the "distal-as-indicator" approach.
• The distal is treated as an additional latent class indicator if included as an endogenous variable.
  – This means your latent class variable is now specified as measured by all the items and the distals.
  – This may be what you intend but, if so, the distals should be included as indicators from the get-go.
NOT GOOD OR BAD, JUST MAYBE NOT WHAT YOU WANT
• What if you don't want your distal outcomes to characterize/measure the latent class variable?
• All the other existing approaches are an attempt to keep the distal outcome from influencing the class formation.
ALTERNATIVES TO DISTAL-AS-INDICATOR
• Old 3-step has the same problems as it does for latent class regression
• Modified 1-step fixes all measurement parameters (e.g., item thresholds) at their estimated values from the unconditional model.
• New 3-step
  – Done the same way as for the LCR. Mplus will test for differences in means assuming equal variances (DE3step) or allowing unequal variances (DU3step).
  – The Mplus implementation is limited, but you can always do a manual 3-step in order to analyze multiple distal outcomes at the same time while including covariates, potential moderators, etc.
  – WARNING: The 3-step approach does not guarantee that your distal will not influence the latent class formation. Mplus checks for this now; you have to check yourself if using the manual 3-step.
AUXILIARY = Z (DCON/DCAT)
• Based on a clever application of Bayes' Theorem by Lanza et al. (2013).
• Basic idea: Regress C on Z to obtain Pr(C|Z) and Pr(C), estimate the density function of Z for Pr(Z), and then apply Bayes' Theorem to get Pr(Z|C).
• This technique does better w.r.t. not allowing Z to influence class formation, but it is very limited w.r.t. the structural models that can be specified (e.g., one distal at a time; must assume the distal is independent of covariates; etc.).
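For a categorical distal, that computation can be sketched numerically. All probabilities below are hypothetical stand-ins for quantities estimated from the fitted model:

```python
# Hypothetical binary distal Z and 3 classes.
# pr_c_given_z[z][j] = Pr(C = j+1 | Z = z), e.g. from the regression of C on Z.
pr_c_given_z = {0: [0.50, 0.30, 0.20],
                1: [0.25, 0.30, 0.45]}
pr_z = {0: 0.6, 1: 0.4}            # estimated distribution of Z

# Pr(C = j) = sum_z Pr(C = j | Z = z) Pr(Z = z)
pr_c = [sum(pr_c_given_z[z][j] * pr_z[z] for z in pr_z) for j in range(3)]

# Bayes' Theorem: Pr(Z = z | C = j) = Pr(C = j | Z = z) Pr(Z = z) / Pr(C = j)
pr_z_given_c = {z: [pr_c_given_z[z][j] * pr_z[z] / pr_c[j] for j in range(3)]
                for z in pr_z}
```

The class-specific distal distributions Pr(Z|C = j) sum to 1 within each class, which is a quick sanity check on the computation.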
MIXTURE MODEL BUILDING STEPS
1. Data screening (and an unconditional, saturated non-mixture model, if applicable).
2. Class enumeration process (without covariates):
   a) Enumeration (within each k structure, if applicable).
   b) Comparisons of the most plausible models from (a).
   NOTE: You may end up going through this step multiple times, as you may realize you need to modify or reconsider your set of class indicators.
3. Select the final unconditional model.
4. Add potential predictors; consider both prediction of class membership and also possible measurement non-invariance/DIF.
5. Conditional mixture model with distal outcomes: Add potential distal outcomes of class membership.
MODELING EXTENSIONS
PREDICTORS AND DISTALS = LC MEDIATION!
REGRESSION MIXTURE MODELS
HIGHER-ORDER LATENT CLASS
[Diagram: second-order latent class variable C measured by first-order latent class variables C1, C2, C3]
MULTIPLE GROUP LCA (USES KNOWNCLASS OPTION)
[Diagram: known-group variable CG (KNOWNCLASS) with latent class variable C1 within groups]
MULTILEVEL LCA
GENERAL FACTOR MIXTURE MODEL
[Diagram: latent class variable C with factors f1, f2, f3]
SPECIFIC FACTOR MIXTURE MODEL
MANY OTHER EXTENSIONS
• Latent class causal models
  – Complier average causal effects
  – Latent class causal mediation models
  – Causal effects of latent class membership
• Mixture IRT
• Pattern mixture models for missing data
• Etc.
• Etc.
• Etc.
LONGITUDINAL MIXTURE MODELS
LONGITUDINAL LCA (LLCA) / RMLCA
LONGITUDINAL LCA
• Use a latent class variable to characterize longitudinal response patterns.
• The EXACT same modeling process as for LCA/LPA!
• The EXACT same syntax in Mplus.
  – The only difference is that in your data, u1–uM or y1–yM are a single variable measured at multiple time points rather than multiple measures at a single time point.
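That data restructuring, from one repeated measure in long format to u1–uM columns in wide format, can be sketched in plain Python (variable names are illustrative):

```python
# Long-format records: (person id, wave, binary response u).
long_records = [(1, 1, 0), (1, 2, 1), (1, 3, 1),
                (2, 1, 0), (2, 2, 0), (2, 3, 1)]

# Wide format for LLCA/RMLCA: one row per person, columns u1..uM,
# where each column is the same measure at a different time point.
wide = {}
for pid, wave, u in long_records:
    wide.setdefault(pid, {})["u%d" % wave] = u
```

Once the data are wide, the LCA syntax treats u1–uM exactly like any other set of class indicators.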
GROWTH MIXTURE MODELS
[Path diagram (GGMM): repeated measures Y1–Y4 loading on growth factors, latent class c, covariate x, and outcomes u and z]
GENERAL GROWTH MIXTURE MODEL (GGMM)
AGGRESSION DEVELOPMENT: CONTROL AND INTERVENTION GROUPS
LATENT TRANSITION ANALYSIS (LTA)
Transition probability matrix (Time 1 classes in rows, Time 2 classes in columns):

          C2=1       C2=2       C2=3
C1=1   Pr(1→1)   Pr(1→2)   Pr(1→3)
C1=2   Pr(2→1)   Pr(2→2)   Pr(2→3)
C1=3   Pr(3→1)   Pr(3→2)   Pr(3→3)
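The entries of the transition matrix are row-conditional probabilities Pr(C2 = k | C1 = j): each row of a Time 1 by Time 2 cross-tabulation divided by its row total. A sketch with hypothetical counts:

```python
# Hypothetical cross-tab: rows = class at Time 1, columns = class at Time 2.
counts = [[60, 25, 15],
          [10, 70, 20],
          [ 5, 15, 80]]

# Transition probabilities Pr(C2 = k | C1 = j): divide each row by its total,
# so each row sums to 1.
trans = [[c / sum(row) for c in row] for row in counts]
```

Large diagonal entries (here 0.6–0.8) indicate stability of class membership over time; off-diagonal entries are the transition rates of substantive interest in LTA.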
LTA
• Begin with LCA/LPA models for each time point separately. Use the exact same modeling process as for a single cross-sectional LCA/LPA.
• Bring the latent class variables together in a single model. Watch for label switching and for actual changes in measurement model parameters at each wave once all time points are in the same model.
  – There is an LTA 3-step. See the NEW Web Note 15 for more information.
• Bring in covariates and distal outcomes using the same approaches as for LCA/LPA.
LTA with predictors that influence not only class membership at each time point but also the transitions.
Here is how you have to specify that in Mplus. You can rearrange the results to address the questions posed by the model above.
MANY OTHER LONGITUDINALMIXTURE MODELS
• Survival mixture models
• Latent change score mixture models
• Onset-to-growth mixture models
• Associative LTA
• Latent transition growth mixture models
• Etc.
• Etc.
• Etc.
PARTING WORDS
MIXTURE MODELS: LAUDED BY SOME
• Theoretical models that conceptualize individual differences at the latent level as differences in kind, that consider typologies or taxonomies, map directly onto analytic latent class models.
• Mixture models give us a great deal of flexibility in terms of how we characterize population heterogeneity and individual differences with respect to a latent phenomenon.
• Can help avoid serious distortions that can result from ignoring population heterogeneity if it is, indeed, present.
MIXTURE MODELS: IMPUGNED BY OTHERS
• Latent classes or mixtures may not reflect the Truth.
• Nominalistic fallacy: Naming the latent classes does not necessarily make them what we call them or ensure that we understand them.
• Reification: Just because the model yields latent classes doesn't mean the latent classes are real or that we've done anything to prove their existence.
• The empirically extracted latent classes depend upon the within- and between-class model specification and the joint distribution of the indicators. Thus, the resultant classes may diverge markedly from the underlying “True” latent structure in the population.
• Do these criticisms sound familiar? They are nearly identical to the critique of path analysis and SEM in the second half of the 20th century, because some of the same bad modeling practices have reappeared:
  – "Nobody pays much attention to the assumptions, and the technology tends to overwhelm common sense." (Friedman, 1987)
DON'T CUT OFF YOUR LATENT CLASSES TO SPITE YOUR MODEL
• Any model is, at best, an approximation to reality.
• "All models are wrong, but some are useful." (George Box)
• We can evaluate model-theory consistency.
• We can evaluate model-data consistency.
• There are many alternative ways of thinking about relationships in a variable system, and if mixture modeling can be useful in empirically distinguishing between or among alternative perspectives, then it provides important information.
• Understanding individual differences is paramount in social and developmental research.
• The flexibility we gain in the parameterization of individual differences using mixtures extends to flexibility in prediction of those differences and prediction from those differences.
MIXTURE MODEL CARE AND FEEDING
• Be sure to document your model building and selection very carefully, for yourself and for reviewers. Be prepared to defend your modeling choices in the event you get a review that is more skeptical than most about the methodology.
• Resist the temptation to take your discrete representation of population heterogeneity and interpret and discuss the resultant classes as if you had established their existence (e.g., if you fit a three-class model and get a three-class solution, you haven't proved the existence of three classes generally, nor of those three classes specifically).
• In designing studies in which you plan to do LCA/LPA, don't formulate hypotheses such as "There will be four classes of engagement," because the exploratory class enumeration process doesn't actually test K = 4 versus K ≠ 4. This also makes it impossible to compute power.
• Don’t be afraid to do some sensitivity analyses to understand the hierarchy of influence in your variable system and the vulnerability of your latent class formations to small shifts in that system.
• Don’t check your common sense and broader modeling skills at the door when embarking on LCA/LPA. There are some modeling best-practices that translate extremely well to the LCA setting.
• Don’t get so overwhelmed with all the fit indices, etc. that you forget to fully evaluate the substantive utility and meaning in the resultant classes.
• Don't be so dazzled by your own results that you aren't able to effectively and critically evaluate them with respect to validity criteria.
• Don’t fall so deeply in love with mixture modeling that it becomes your default analytic approach with any multivariate data.
QUESTIONS?
THANK YOU!
SELECT REFERENCES & RESOURCES
• Mplus website: www.statmodel.com
• Latent GOLD website: http://statisticalinnovations.com/products/latentgold.html
• Penn State Methodology Center: http://methodology.psu.edu/
• UCLA Institute for Digital Research & Educ.: https://idre.ucla.edu/stats
For more, see the text and references of: Masyn, K. (2013). Latent class analysis and finite mixture modeling. In T. D. Little (Ed.) The Oxford handbook of quantitative methods in psychology (Vol. 2, pp. 551-611). New York, NY: Oxford University Press.