Lab 10.1 - Exploring Response Patterns and Model Fit in Latent Class Analysis

Structural Equation Modeling - Instructor: Karen Nylund-Gibson

Adam Garber

June 07, 2020

University of California, Santa Barbara


Lab preparation


Creating a version-controlled R-Project with Github

Download repository here: https://github.com/garberadamc/SEM-Lab10

On the Github repository webpage:

  1. fork the lab repository to your own Github account
  2. copy the repository web URL address from the clone or download menu

Within R-Studio:

  1. click “NEW PROJECT”
  2. choose option Version Control
  3. choose option Git
  4. paste the repository web URL path copied from the clone or download menu on Github page
  5. choose location of the R-Project

Exploring observed response patterns


Load data

Use {DT::datatable()} to take a look at the data

Figure. Path diagram of science attitude indicators.


Save response frequencies for the 4-class model with the Mplus SAVEDATA option "response is _____.dat".

Read in observed response pattern data

Order responses by highest frequency
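
A minimal sketch of this ordering step in base R, assuming the response patterns have already been read into a data frame. The object name patterns and its column names are hypothetical, and the three rows are copied from the response pattern table in this lab; the lab's own code may use {tidyverse} functions instead.

```r
# Hypothetical data frame holding a few observed response patterns
# (frequencies and item responses copied from the lab's table).
patterns <- data.frame(
  frequency = c(135, 558, 313),
  enjoy     = c(1, 0, 1),
  useful    = c(0, 0, 0),
  logical   = c(1, 0, 0),
  job       = c(0, 0, 0),
  adult     = c(0, 0, 0)
)

# Sort rows so that the most frequent pattern comes first.
ordered_patterns <- patterns[order(patterns$frequency, decreasing = TRUE), ]
print(ordered_patterns)
```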

Use {gt} to make a nicely formatted table

Observed response patterns, estimated frequencies, estimated posterior class probabilities, and modal class assignment.

Frequency  Enjoy  Useful  Logical  Job  Adult  Pk=1   Pk=2   Pk=3   Pk=4   k

Unconditional response patterns ordered by highest frequency
558        0      0       0        0    0      0.000  0.117  0.000  0.883  4
529        1      1       1        1    1      0.957  0.000  0.043  0.000  1
313        1      0       0        0    0      0.000  0.307  0.000  0.693  4
135        1      0       1        0    0      0.002  0.977  0.000  0.021  2
94         1      1       1        0    1      0.687  0.000  0.313  0.000  1

k=1 conditional response pattern ordered by highest frequency
529        1      1       1        1    1      0.957  0.000  0.043  0.000  1
94         1      1       1        0    1      0.687  0.000  0.313  0.000  1
78         0      1       1        1    1      0.859  0.000  0.141  0.000  1
62         1      1       0        1    1      0.580  0.000  0.420  0.000  1
55         1      1       1        1    0      0.650  0.350  0.000  0.000  1

k=2 conditional response pattern ordered by highest frequency
135        1      0       1        0    0      0.002  0.977  0.000  0.021  2
88         0      0       1        0    0      0.000  0.934  0.000  0.066  2
74         1      1       1        0    0      0.063  0.937  0.000  0.000  2
47         1      1       0        0    0      0.006  0.994  0.000  0.000  2
44         1      0       0        1    0      0.004  0.643  0.000  0.353  2

k=3 conditional response pattern ordered by highest frequency
91         1      0       0        0    1      0.003  0.000  0.937  0.060  3
88         1      0       1        1    1      0.337  0.000  0.663  0.000  3
76         1      0       1        0    1      0.048  0.000  0.951  0.001  3
70         1      0       0        1    1      0.031  0.000  0.964  0.006  3
53         0      0       0        0    1      0.001  0.000  0.763  0.236  3

k=4 conditional response pattern ordered by highest frequency
558        0      0       0        0    0      0.000  0.117  0.000  0.883  4
313        1      0       0        0    0      0.000  0.307  0.000  0.693  4
53         0      0       0        1    0      0.000  0.353  0.000  0.647  4
11         0      0       NA       0    0      0.000  0.231  0.000  0.769  4
9          0      NA      0        0    0      0.000  0.170  0.000  0.829  4

Data Source: Longitudinal Study of American Youth.
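
The modal class assignment in column k is the class with the largest posterior probability for a given response pattern. The sketch below illustrates this in base R using the five unconditional rows from the table above; the object names posterior and modal_k are hypothetical.

```r
# Posterior class probabilities for five response patterns,
# copied from columns Pk=1 to Pk=4 of the table above.
posterior <- rbind(
  c(0.000, 0.117, 0.000, 0.883),
  c(0.957, 0.000, 0.043, 0.000),
  c(0.000, 0.307, 0.000, 0.693),
  c(0.002, 0.977, 0.000, 0.021),
  c(0.687, 0.000, 0.313, 0.000)
)
colnames(posterior) <- paste0("Pk", 1:4)

# Modal assignment: index of the largest posterior probability in each row.
modal_k <- max.col(posterior, ties.method = "first")
print(modal_k)
```

Patterns whose largest probability is well below 1 (for example 0.687 in the last row) are assigned with less certainty than patterns such as the second row (0.957).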

Visualizing observed response patterns


Order rows by modal class assignment (k)

Prepare plot data

Visualize observed response patterns with {plotly}

Make a 3D plot with packages {ggplot2}, {gg3D}, and {gganimate}.


Comparing model fit

Learning objective: Generate a comprehensive model fit summary table.

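Information criteria (the model with the lowest value is preferred):

  • BIC: \[ -2LL + N_{par}\ln(N) \]
  • aBIC: \[ -2LL + N_{par}\ln((N+2)/24) \]
  • CAIC: \[ -2LL + N_{par}(\ln(N)+1) \]
  • AWE: \[ -2LL + 2N_{par}(\ln(N)+1.5) \]

Here LL is the model log likelihood, N_{par} is the number of free parameters, and N is the sample size.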
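
The four formulas above can be sketched as a small R function. This is an illustration, not the lab's own code: the function name fit_indices is hypothetical, the LL and NPar values are copied from the Model Fit Summary Table, and the sample size n is an assumption (a value chosen to be consistent with the BIC column, since the lab text does not report N).

```r
# Information criteria computed from the log likelihood (ll), the number of
# free parameters (npar), and the sample size (n).
fit_indices <- function(ll, npar, n) {
  data.frame(
    BIC  = -2 * ll + npar * log(n),
    aBIC = -2 * ll + npar * log((n + 2) / 24),
    CAIC = -2 * ll + npar * (log(n) + 1),
    AWE  = -2 * ll + 2 * npar * (log(n) + 1.5)
  )
}

# LL and NPar for the 1- to 4-class models, from the Model Fit Summary Table.
ll   <- c(-10250.60, -8785.32, -8693.57, -8664.09)
npar <- c(5, 11, 17, 23)

# ASSUMPTION: n is not stated in the lab; replace it with the N of your data.
n <- 3061

fit <- fit_indices(ll, npar, n)
print(round(fit, 2))
```

Note that CAIC always exceeds BIC by exactly the number of parameters, and AWE applies the heaviest penalty of the four, which is why AWE can favor a smaller model than BIC does.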

Comparing models:

  • VLMR: Vuong-Lo-Mendell-Rubin LRT (TECH11)
  • BLRT: Bootstrap LRT (TECH14)
  • BF: Bayes Factor
  • cmP(k): Correct Model Probability
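
The last two indices in this list are commonly approximated from the BIC. With SIC = -0.5 * BIC, an approximate Bayes factor comparing models A and B is exp(SIC_A - SIC_B), and the approximate correct model probability of model k is exp(SIC_k) divided by the sum of exp(SIC_j) over all models considered. The sketch below assumes these approximations and uses the BIC values of the 1- to 6-class models from the Model Fit Summary Table; the object names are hypothetical and the lab's own code may differ.

```r
# BIC values for the 1- to 6-class models, from the Model Fit Summary Table.
bic <- c(20541.34, 17658.92, 17523.59, 17512.79, 17557.54, 17604.01)

# Schwarz information criterion (log-scale approximation based on the BIC).
sic <- -0.5 * bic

# Approximate Bayes factor comparing each k-class model to the (k+1)-class
# model; the last model has no successor, so its entry is NA.
bf <- c(exp(sic[-length(sic)] - sic[-1]), NA)

# Approximate correct model probability. Subtracting max(sic) before
# exponentiating avoids numerical underflow and leaves the ratio unchanged.
cmp <- exp(sic - max(sic)) / sum(exp(sic - max(sic)))

print(data.frame(classes = 1:6, BF = signif(bf, 3), cmPk = round(cmp, 3)))
```

Under these assumptions the Bayes factor for the 4-class versus the 5-class model is about 5.2 billion, in line with the 5.22B shown in the table. Across the six models alone, the 4-class model receives nearly all of the probability mass; the 0.5 reported in the table is consistent with the additional row that repeats the 4-class model being included in the set of candidates.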

Run a quick enumeration


Create model fit summary table


Extract data and calculate indices derived from the Log Likelihood


Format table with package {gt}

Model Fit Summary Table

Classes                    NPar   LL          BIC        aBIC       CAIC       AWE        VLMR  BLRT  Bayes Factor  cmPk
Class 1                    5.00   −10,250.60  20,541.34  20,525.45  20,546.34  20,596.47  NA    NA    0             0
Class 2                    11.00  −8,785.32   17,658.92  17,623.97  17,669.93  17,780.22  0.00  0.00  0             0
Class 3                    17.00  −8,693.57   17,523.59  17,469.57  17,540.59  17,711.04  0.00  0.00  0             0
Class 4                    23.00  −8,664.09   17,512.79  17,439.71  17,535.79  17,766.40  0.00  0.00  5.22B         0.5
Class 5                    29.00  −8,662.39   17,557.54  17,465.40  17,586.54  17,877.31  0.67  1.00  12.32B        0
Class 6                    35.00  −8,661.54   17,604.01  17,492.80  17,639.01  17,989.94  0.75  1.00  0             0
Step1 - 3step LSAY - Lab9  23.00  −8,664.09   17,512.79  17,439.71  17,535.79  17,766.40  0.00  0.00  NA            0.5
Data Source: Longitudinal Study of American Youth.

References

Linzer, D. A., & Lewis, J. B. (2011). poLCA: An R Package for Polytomous Variable Latent Class Analysis. Journal of Statistical Software, 42(10), 1-29. URL http://www.jstatsoft.org/v42/i10/

Hallquist, M. N., & Wiley, J. F. (2018). MplusAutomation: An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 25(4), 621-638.

Miller, J. D., Hoffer, T., Suchner, R., Brown, K., & Nelson, C. (1992). LSAY codebook. Northern Illinois University.

Muthén, B. O., Muthén, L. K., & Asparouhov, T. (2017). Regression and mediation analysis using Mplus. Los Angeles, CA: Muthén & Muthén.

Muthén, L. K., & Muthén, B. O. (1998-2017). Mplus User’s Guide (8th ed.). Los Angeles, CA: Muthén & Muthén.

R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/

Wickham, H., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686


 

A work by Adam Garber

agarber@ucsb.edu