Loglinear LCA

Author

DL Oberski

Loglinear formulation of LCA

The loglinear formulation of LCA is very powerful. It only works for categorical data, but, in addition to the “plain vanilla” kind of locally independent LCA, it allows for:

  • Built-in handling of missing data (everything is a missing data problem);
  • NMAR (“unseen data-dependent”) models;
  • Multiple latent class variables, unlocking arbitrary modelling with latent variables, i.e. “categorical SEM”;
  • “Linear by linear” and “linear by nominal” effects, making for more parsimonious models, unlocking:
      • Discretized, “nonparametric” versions of IRT models (“DFactor models” in Latent Gold);
  • Covariates/concomitant variables, including “measurement invariance” models;
  • Facilities for complex sampling designs, i.e. weighting, clustering, and stratification;
  • and more.

The largest drawback is that, as implemented in most software (including the excellent cvam package in R), it is extremely memory-intensive. This means that many models that run easily in other software will refuse to run on your laptop, giving an “out of memory” error.

The loglinear formulation of LCA is the basis of the powerful Latent Gold software, as well as its predecessor, the still-popular LEM program (https://jeroenvermunt.nl/). Latent Gold uses many optimizations, which mostly allow it to circumvent the “small models only” problem. Here, we will use cvam, whose implementation is very similar to LEM.

To demonstrate the uses of loglinear LCA, we use the carcinoma data from the poLCA package.

Simple analysis using poLCA
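
First, load and inspect the data; a minimal sketch, assuming poLCA is installed (the listing below shows the first rows):

library(poLCA)
data("carcinoma")
head(carcinoma)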

  A B C D E F G
1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2
3 1 1 1 1 1 1 1
4 2 2 1 1 2 1 2
5 2 2 2 2 2 2 2
6 1 1 1 1 1 1 1
f_carcinoma <- cbind(A, B, C, D, E, F, G) ~ 1
fit_polca <- poLCA(f_carcinoma, 
                   nclass = 2,
                   data = carcinoma, 
                   nrep = 10,
                   verbose = FALSE)

fit_polca
Conditional item response (column) probabilities,
 by outcome variable, for each class (row) 
 
$A
           Pr(1)  Pr(2)
class 1:  0.0000 1.0000
class 2:  0.8835 0.1165

$B
           Pr(1)  Pr(2)
class 1:  0.0169 0.9831
class 2:  0.6456 0.3544

$C
           Pr(1)  Pr(2)
class 1:  0.2391 0.7609
class 2:  1.0000 0.0000

$D
           Pr(1)  Pr(2)
class 1:  0.4589 0.5411
class 2:  1.0000 0.0000

$E
           Pr(1)  Pr(2)
class 1:  0.0214 0.9786
class 2:  0.7771 0.2229

$F
           Pr(1)  Pr(2)
class 1:  0.5773 0.4227
class 2:  1.0000 0.0000

$G
           Pr(1)  Pr(2)
class 1:  0.0000 1.0000
class 2:  0.8835 0.1165

Estimated class population shares 
 0.5012 0.4988 
 
Predicted class memberships (by modal posterior prob.) 
 0.5 0.5 
 
========================================================= 
Fit for 2 latent classes: 
========================================================= 
number of observations: 118 
number of estimated parameters: 15 
residual degrees of freedom: 103 
maximum log-likelihood: -317.2568 
 
AIC(2): 664.5137
BIC(2): 706.0739
G^2(2): 62.36543 (Likelihood ratio/deviance statistic) 
X^2(2): 92.64814 (Chi-square goodness of fit) 
 

From the poLCA model, using the bivariate residual (BVR) functionality in the poLCA.extras package, we can see that there are some considerable local dependencies.
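
(The bvr() and bootstrap_bvr_pvals() functions are assumed to come from the poLCA.extras package, which is available from GitHub rather than CRAN; a sketch of loading it:)

# Assumed source of bvr() and bootstrap_bvr_pvals()
# remotes::install_github("daob/poLCA.extras")
library(poLCA.extras)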

bvr(fit_polca) |> round(1)
    A   B   C   D   E   F
B 1.9                    
C 0.5 4.1                
D 0.5 1.2 1.5            
E 0.6 8.3 0.6 9.3        
F 0.5 1.2 1.9 8.2 0.6    
G 7.1 5.7 1.5 1.0 1.9 0.9
bootstrap_bvr_pvals(f_carcinoma, 
                    fit_polca = fit_polca, 
                    data = carcinoma, R = 200)
      A     B     C     D     E     F
B 0.205                              
C 0.190 0.090                        
D 0.105 0.360 0.150                  
E 0.645 0.000 0.590 0.065            
F 0.080 0.290 0.095 0.000 0.450      
G 0.005 0.010 0.035 0.020 0.170 0.015

Simple and locally dependent LCA using cvam

library(cvam)
library(dplyr)  # for the %>% pipe and mutate_all() used below

options(contrasts = rep("contr.sum", 2))

dat <- carcinoma %>% mutate_all(as.factor)
df_freq <- table(dat, useNA = "ifany") |> as.data.frame()

# Create a new variable that is completely missing (and has 2 cats)
df_freq$X <- latentFactor(NROW(df_freq), 2)

# Create formula
formula <- 
  paste(names(df_freq)[!names(df_freq) %in% c("Freq", "X")],
        collapse = " + ") %>% 
  sprintf("~ X * (%s)", .) %>%
  as.formula()

system.time(
  fit_cvam <- cvam(formula, data = df_freq,
                   freq = Freq,
                   control = list(startValJitter = 0.1)
))
Note: Estimate at or near boundary
Estimated variances may be unreliable
   user  system elapsed 
  0.067   0.001   0.069 
summary(fit_cvam)
~ X * (A + B + C + D + E + F + G)

Prior:
      Flattening frequency = 0
Total nuggets + flattening = 0
              Ridge factor = 0
          Intensity factor = 1

Sample size:
              total N in supplied data = 118
N from supplied data used in model fit = 118
           prior effective sample size =   0

Degrees of freedom:
    patterns of coarsened data = 128
  cells in complete-data table = 256
cells without latent variables = 128
         structural zero cells =   0
   parameters in Poisson model =  16
                            df = 112

Starting values:
default, center
jitter SD = 0.100000

EM algorithm:
Converged at iteration 42
Gradient length = 0.000007

  Final logP = 127.6839
Final loglik = 127.6839

Estimates from EM, with Hessian-based SEs
                coef       SE zstat   pval
(Intercept) -23.4100 669.5679 -0.03 0.9721
X1            3.3833 669.5680  0.01 0.9960
A1           -3.8298 315.4106 -0.01 0.9903
B1           -0.8658   0.2612 -3.31 0.0009
C1            4.2032 261.6161  0.02 0.9872
D1            4.4529 266.4354  0.02 0.9867
E1           -0.6439   0.2572 -2.50 0.0123
F1            4.5816 266.0340  0.02 0.9863
G1           -4.0925 372.3337 -0.01 0.9912
X1:A1        -4.8428 315.4106 -0.02 0.9877
X1:B1        -1.1657   0.2614 -4.46 0.0000
X1:C1        -4.7819 261.6161 -0.02 0.9854
X1:D1        -4.5352 266.4354 -0.02 0.9864
X1:E1        -1.2683   0.2590 -4.90 0.0000
X1:F1        -4.4257 266.0340 -0.02 0.9867
X1:G1        -5.1055 372.3337 -0.01 0.9891
what <- paste0("~", LETTERS[1:7], "|X") |> sapply(as.formula)
est_indep <- cvamEstimate(c(~X, what), fit_cvam)
est_indep
Estimates and SE's from EM, linearized
~ X
  X   prob     SE prob.lower prob.upper
1 1 0.5012 0.0463     0.4113     0.5910
2 2 0.4988 0.0463     0.4090     0.5887
~ A | X
  X A   prob     SE prob.lower prob.upper
1 1 1 0.0000 0.0000     0.0000     1.0000
2 1 2 1.0000 0.0000     0.0000     1.0000
3 2 1 0.8835 0.0429     0.7702     0.9449
4 2 2 0.1165 0.0429     0.0551     0.2298
~ B | X
  X B   prob     SE prob.lower prob.upper
1 1 1 0.0169 0.0168     0.0024     0.1105
2 1 2 0.9831 0.0168     0.8895     0.9976
3 2 1 0.6456 0.0627     0.5156     0.7572
4 2 2 0.3544 0.0627     0.2428     0.4844
~ C | X
  X C   prob     SE prob.lower prob.upper
1 1 1 0.2391 0.0561     0.1466     0.3650
2 1 2 0.7609 0.0561     0.6350     0.8534
3 2 1 1.0000 0.0000     0.0000     1.0000
4 2 2 0.0000 0.0000     0.0000     1.0000
~ D | X
  X D   prob     SE prob.lower prob.upper
1 1 1 0.4589 0.0651     0.3367     0.5863
2 1 2 0.5411 0.0651     0.4137     0.6633
3 2 1 1.0000 0.0000     0.0000     1.0000
4 2 2 0.0000 0.0000     0.0000     1.0000
~ E | X
  X E   prob     SE prob.lower prob.upper
1 1 1 0.0214 0.0206     0.0032     0.1305
2 1 2 0.9786 0.0206     0.8695     0.9968
3 2 1 0.7771 0.0545     0.6530     0.8659
4 2 2 0.2229 0.0545     0.1341     0.3470
~ F | X
  X F   prob     SE prob.lower prob.upper
1 1 1 0.5773 0.0644     0.4488     0.6961
2 1 2 0.4227 0.0644     0.3039     0.5512
3 2 1 1.0000 0.0000     0.0000     1.0000
4 2 2 0.0000 0.0000     0.0000     1.0000
~ G | X
  X G   prob     SE prob.lower prob.upper
1 1 1 0.0000 0.0000     0.0000     1.0000
2 1 2 1.0000 0.0000     0.0000     1.0000
3 2 1 0.8835 0.0429     0.7702     0.9449
4 2 2 0.1165 0.0429     0.0551     0.2298

(Note that the Hessian-based SEs cannot be trusted here. We can instead look at the estimated SEs in the marginal probability tables above, or use cvam’s MCMC option to obtain standard errors on the loglinear parameters.)
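
A minimal sketch of the MCMC route (the iterMCMC control option is an assumption here; check the cvam documentation for the exact option names):

# Re-fit the converged EM solution by MCMC to obtain simulation-based SEs
fit_mcmc <- cvam(fit_cvam, method = "MCMC",
                 control = list(iterMCMC = 5000))
summary(fit_mcmc)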

Now we update the model with the local dependencies, held equal across classes.

# formula_ld <- update(formula, ~ . + D:F + A:G + B:E)
formula_ld <- update(formula, ~ . + (A+B+D+E+F+G)^2)

system.time(
  fit_cvam_ld <- 
    cvam(formula_ld, data = df_freq, freq = Freq,
         control = list(startValJitter = 0.05, iterMaxEM = 1000),
         prior = cvam::cvamPrior(ridge = 0.1)
))
Note: Estimate at or near boundary
Estimated variances may be unreliable
   user  system elapsed 
  0.154   0.001   0.156 
summary(fit_cvam_ld)
~ X + A + B + C + D + E + F + G + X:A + X:B + X:C + X:D + X:E + X:F + X:G + A:B + A:D + A:E + A:F + A:G + B:D + B:E + B:F + B:G + D:E + D:F + D:G + E:F + E:G + F:G

Prior:
      Flattening frequency = 0.0
Total nuggets + flattening = 0.0
              Ridge factor = 0.1
          Intensity factor = 1.0

Sample size:
              total N in supplied data = 118
N from supplied data used in model fit = 118
           prior effective sample size =   0

Degrees of freedom:
    patterns of coarsened data = 128
  cells in complete-data table = 256
cells without latent variables = 128
         structural zero cells =   0
   parameters in Poisson model =  31
                            df =  97

Starting values:
default, center
jitter SD = 0.050000

EM algorithm:
Converged at iteration 48
Gradient length = 0.000007

  Final logP = 156.3212
Final loglik = 156.3619

Estimates from EM, with Hessian-based SEs
                coef     SE zstat   pval
(Intercept) -7.44766 1.2413 -6.00 0.0000
X1           0.05773 1.1782  0.05 0.9609
A1          -1.07054 0.8586 -1.25 0.2124
B1          -0.99729 0.6410 -1.56 0.1198
C1           0.91826 0.4892  1.88 0.0605
D1           1.37380 0.6926  1.98 0.0473
E1          -0.63117 0.6675 -0.95 0.3444
F1           1.44479 0.6807  2.12 0.0338
G1          -0.61377 0.9487 -0.65 0.5177
X1:A1       -1.15492 0.5704 -2.02 0.0429
X1:B1        0.86834 0.5993  1.45 0.1473
X1:C1       -1.80735 0.4922 -3.67 0.0002
X1:D1       -0.43269 0.4615 -0.94 0.3484
X1:E1       -1.20903 0.6867 -1.76 0.0783
X1:F1       -0.90714 0.9970 -0.91 0.3629
X1:G1       -1.15465 0.6863 -1.68 0.0925
A1:B1        0.46370 0.2411  1.92 0.0544
A1:D1        0.54873 0.5507  1.00 0.3190
A1:E1        0.02503 0.2238  0.11 0.9109
A1:F1        0.05476 0.7841  0.07 0.9443
A1:G1        0.26642 0.2274  1.17 0.2413
B1:D1        0.39565 0.5548  0.71 0.4758
B1:E1        0.63904 0.2244  2.85 0.0044
B1:F1        0.27186 0.5824  0.47 0.6407
B1:G1        0.97343 0.5155  1.89 0.0590
D1:E1       -0.42058 0.4911 -0.86 0.3917
D1:F1        0.37588 0.1552  2.42 0.0154
D1:G1        0.64085 0.5849  1.10 0.2732
E1:F1        0.15256 0.7192  0.21 0.8320
E1:G1        0.47018 0.2358  1.99 0.0462
F1:G1       -0.05171 0.8287 -0.06 0.9502
est_locdep <- cvamEstimate(c(~X, what), fit_cvam_ld)
est_locdep
Estimates and SE's from EM, linearized
~ X
  X   prob     SE prob.lower prob.upper
1 1 0.4443 0.0502     0.3493     0.5436
2 2 0.5557 0.0502     0.4564     0.6507
~ A | X
  X A   prob     SE prob.lower prob.upper
1 1 1 0.0050 0.0097     0.0001     0.1889
2 1 2 0.9950 0.0097     0.8111     0.9999
3 2 1 0.7905 0.0582     0.6545     0.8826
4 2 2 0.2095 0.0582     0.1174     0.3455
~ B | X
  X B   prob     SE prob.lower prob.upper
1 1 1 0.0226 0.0203     0.0038     0.1232
2 1 2 0.9774 0.0203     0.8768     0.9962
3 2 1 0.5794 0.0646     0.4503     0.6984
4 2 2 0.4206 0.0646     0.3016     0.5497
~ C | X
  X C   prob     SE prob.lower prob.upper
1 1 1 0.1445 0.0629     0.0587     0.3141
2 1 2 0.8555 0.0629     0.6859     0.9413
3 2 1 0.9957 0.0081     0.8501     0.9999
4 2 2 0.0043 0.0081     0.0001     0.1499
~ D | X
  X D   prob     SE prob.lower prob.upper
1 1 1 0.4202 0.0710     0.2905     0.5619
2 1 2 0.5798 0.0710     0.4381     0.7095
3 2 1 0.9720 0.0263     0.8391     0.9957
4 2 2 0.0280 0.0263     0.0043     0.1609
~ E | X
  X E   prob     SE prob.lower prob.upper
1 1 1 0.0048 0.0096     0.0001     0.1938
2 1 2 0.9952 0.0096     0.8062     0.9999
3 2 1 0.7146 0.0617     0.5805     0.8192
4 2 2 0.2854 0.0617     0.1808     0.4195
~ F | X
  X F   prob     SE prob.lower prob.upper
1 1 1 0.5242 0.0718     0.3852     0.6596
2 1 2 0.4758 0.0718     0.3404     0.6148
3 2 1 0.9948 0.0108     0.7542     0.9999
4 2 2 0.0052 0.0108     0.0001     0.2458
~ G | X
  X G   prob     SE prob.lower prob.upper
1 1 1 0.0046 0.0091     0.0001     0.1882
2 1 2 0.9954 0.0091     0.8118     0.9999
3 2 1 0.7905 0.0583     0.6543     0.8827
4 2 2 0.2095 0.0583     0.1173     0.3457
anova(fit_cvam, fit_cvam_ld, method = "BIC")
Model 1: ~ X * (A + B + C + D + E + F + G)
Model 2: ~ X + A + B + C + D + E + F + G + X:A + X:B + X:C + X:D + X:E + X:F + X:G + A:B + A:D + A:E + A:F + A:G + B:D + B:E + B:F + B:G + D:E + D:F + D:G + E:F + E:G + F:G
  resid.df -2*loglik     BIC rank
1      112   -255.37 -179.04    1
2       97   -312.72 -164.83    2
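
As a quick sanity check, the BIC column is the usual -2*loglik + p*log(N); for Model 1, with 16 Poisson-model parameters and N = 118:

-255.37 + 16 * log(118)  # approx. -179.04, matching the table above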

The locally dependent model achieves a considerably higher log-likelihood, although note that BIC, which penalizes its 15 additional parameters, actually favors the more parsimonious, locally independent model (rank 1 in the table above).

Extensions to the basic LC model using loglinear formulation

Including covariates

Multiple latent variables

Example from Oberski (2016), available at https://doi.org/10.1007/s11634-015-0211-0 (open access).

library(readr)
library(cvam)
options(contrasts = rep("contr.sum", 2))

Data are from the LISS panel. People were asked five times whether they had voted in the last parliamentary election; these questions pertained to two separate elections (2006 and 2010). The substantively inspired model is shown below.

flowchart TD
  X1((Voted 2006)) ---> A[2008]
  X1 ---> B[2009]
  X1 ---> C[2010]
  X2((Voted 2010)) ---> D[2011]
  X2 ---> E[2012]
  X1 --> X2

Read in the data.

path <- "https://daob.nl/files/lca/vote-LISS.dat.gz"
vote <- readr::read_table(path, na = ".")

rmarkdown::paged_table(vote)

We transform the data into frequencies (including the missing values). We also rename the variables to A-E to keep the formulas short.

df_freq <- table(vote[, -1], useNA = "ifany") |> as.data.frame()
names(df_freq)[1:5] <- LETTERS[1:5]

We now formulate the model. It includes two latent variables, X1 and X2, which are meant to represent truly having voted in 2006 and 2010, respectively.

# Create a new variable that is completely missing (and has 2 cats)
df_freq$X1 <- latentFactor(NROW(df_freq), 2)
df_freq$X2 <- latentFactor(NROW(df_freq), 2)

# Create formula
formula <- ~ X1 * (A + B + C) + X2 * (D + E) + X1:X2

We use cvam to fit the model.

fit_cvam <- cvam(formula, data = df_freq, freq = Freq,
                 control = list(startValJitter = 0.1))

Some results of interest.

summary(fit_cvam)
~ X1 * (A + B + C) + X2 * (D + E) + X1:X2

Prior:
      Flattening frequency = 0
Total nuggets + flattening = 0
              Ridge factor = 0
          Intensity factor = 1

Sample size:
              total N in supplied data = 9510
N from supplied data used in model fit = 9510
           prior effective sample size =    0

Degrees of freedom:
    patterns of coarsened data = 243
  cells in complete-data table = 128
cells without latent variables =  32
         structural zero cells =   0
   parameters in Poisson model =  14
                            df =  18

Starting values:
default, center
jitter SD = 0.100000

EM algorithm:
Converged at iteration 206
Gradient length = 0.000005

  Final logP = 68648.1
Final loglik = 68648.1

Estimates from EM, with Hessian-based SEs
               coef      SE  zstat   pval
(Intercept) -0.1313 0.08044  -1.63 0.1027
X11          0.7904 0.11156   7.08 0.0000
A1          -0.7223 0.04392 -16.45 0.0000
B1          -0.4948 0.05053  -9.79 0.0000
C1          -0.1910 0.04447  -4.29 0.0000
X21         -0.2248 0.11662  -1.93 0.0539
D1          -0.6126 0.04945 -12.39 0.0000
E1          -0.2831 0.06674  -4.24 0.0000
X11:A1       1.2421 0.04424  28.08 0.0000
X11:B1       1.3693 0.04807  28.49 0.0000
X11:C1       1.2343 0.04326  28.54 0.0000
X21:D1      -1.2459 0.04699 -26.51 0.0000
X21:E1      -1.4374 0.06316 -22.76 0.0000
X11:X21     -1.0906 0.04178 -26.10 0.0000
what <- paste0("~", LETTERS[1:3], "|X1") |> sapply(as.formula)
what <- what %>% c(paste0("~", LETTERS[4:5], "|X2") |> sapply(as.formula))
what <- c(~X1, ~X2, ~X2 | X1, what)

est_indep <- cvamEstimate(what, fit_cvam)
est_indep
Estimates and SE's from EM, linearized
~ X1
  X1   prob     SE prob.lower prob.upper
1  1 0.1726 0.0057     0.1618     0.1841
2  2 0.8274 0.0057     0.8159     0.8382
~ X2
  X2   prob     SE prob.lower prob.upper
1  1 0.8383 0.0063     0.8256     0.8503
2  2 0.1617 0.0063     0.1497     0.1744
~ X2 | X1
  X1 X2   prob     SE prob.lower prob.upper
1  1  1 0.2448 0.0214     0.2052     0.2892
2  1  2 0.7552 0.0214     0.7108     0.7948
3  2  1 0.9622 0.0049     0.9513     0.9707
4  2  2 0.0378 0.0049     0.0293     0.0487
~ A | X1
  X1 A   prob     SE prob.lower prob.upper
1  1 0 0.7388 0.0204     0.6969     0.7767
2  1 1 0.2612 0.0204     0.2233     0.3031
3  2 0 0.0193 0.0027     0.0147     0.0253
4  2 1 0.9807 0.0027     0.9747     0.9853
~ B | X1
  X1 B   prob     SE prob.lower prob.upper
1  1 0 0.8518 0.0178     0.8135     0.8834
2  1 1 0.1482 0.0178     0.1166     0.1865
3  2 0 0.0235 0.0032     0.0180     0.0305
4  2 1 0.9765 0.0032     0.9695     0.9820
~ C | X1
  X1 C   prob     SE prob.lower prob.upper
1  1 0 0.8896 0.0151     0.8563     0.9159
2  1 1 0.1104 0.0151     0.0841     0.1437
3  2 0 0.0547 0.0043     0.0467     0.0638
4  2 1 0.9453 0.0043     0.9362     0.9533
~ D | X2
  X2 D   prob     SE prob.lower prob.upper
1  1 0 0.0237 0.0032     0.0182     0.0308
2  1 1 0.9763 0.0032     0.9692     0.9818
3  2 0 0.7801 0.0233     0.7310     0.8225
4  2 1 0.2199 0.0233     0.1775     0.2690
~ E | X2
  X2 E   prob     SE prob.lower prob.upper
1  1 0 0.0310 0.0043     0.0237     0.0405
2  1 1 0.9690 0.0043     0.9595     0.9763
3  2 0 0.9096 0.0179     0.8677     0.9391
4  2 1 0.0904 0.0179     0.0609     0.1323

As in the paper, we can also include two local dependencies by updating the formula and refitting.

# Update formula
formula_ld <- update(formula, ~ . + A:B + A:E)

fit_cvam_ld <- cvam(formula_ld, data = df_freq, freq = Freq,
                 control = list(startValJitter = 0.1))

The models are nested, so we can use the LRT to compare them.

anova(fit_cvam, fit_cvam_ld)
Model 1: ~ X1 * (A + B + C) + X2 * (D + E) + X1:X2
Model 2: ~ X1 + A + B + C + X2 + D + E + X1:A + X1:B + X1:C + X2:D + X2:E + X1:X2 + A:B + A:E
  resid.df -2*loglik df change
1       18   -137296          
2       16   -137364  2  67.86
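
The change in deviance of 67.86 on 2 degrees of freedom is overwhelmingly significant; a quick check:

pchisq(67.86, df = 2, lower.tail = FALSE)  # approx. 2e-15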

The actual estimates are not too different, although note that the labels of the X2 classes have switched between the two solutions.

est_locdep <- cvamEstimate(what, fit_cvam_ld)
est_locdep
Estimates and SE's from EM, linearized
~ X1
  X1   prob     SE prob.lower prob.upper
1  1 0.1851 0.0075     0.1709     0.2003
2  2 0.8149 0.0075     0.7997     0.8291
~ X2
  X2   prob     SE prob.lower prob.upper
1  1 0.1674 0.0069     0.1544     0.1813
2  2 0.8326 0.0069     0.8187     0.8456
~ X2 | X1
  X1 X2   prob     SE prob.lower prob.upper
1  1  1 0.7248 0.0277     0.6673     0.7756
2  1  2 0.2752 0.0277     0.2244     0.3327
3  2  1 0.0408 0.0058     0.0309     0.0537
4  2  2 0.9592 0.0058     0.9463     0.9691
~ A | X1
  X1 A   prob     SE prob.lower prob.upper
1  1 0 0.6550 0.0248     0.6049     0.7020
2  1 1 0.3450 0.0248     0.2980     0.3951
3  2 0 0.0272 0.0036     0.0210     0.0351
4  2 1 0.9728 0.0036     0.9649     0.9790
~ B | X1
  X1 B   prob     SE prob.lower prob.upper
1  1 0 0.7671 0.0248     0.7150     0.8121
2  1 1 0.2329 0.0248     0.1879     0.2850
3  2 0 0.0295 0.0040     0.0226     0.0384
4  2 1 0.9705 0.0040     0.9616     0.9774
~ C | X1
  X1 C   prob     SE prob.lower prob.upper
1  1 0 0.9210 0.0189     0.8752     0.9509
2  1 1 0.0790 0.0189     0.0491     0.1248
3  2 0 0.0351 0.0062     0.0248     0.0494
4  2 1 0.9649 0.0062     0.9506     0.9752
~ D | X2
  X2 D   prob     SE prob.lower prob.upper
1  1 0 0.7733 0.0239     0.7231     0.8167
2  1 1 0.2267 0.0239     0.1833     0.2769
3  2 0 0.0200 0.0035     0.0141     0.0281
4  2 1 0.9800 0.0035     0.9719     0.9859
~ E | X2
  X2 E   prob     SE prob.lower prob.upper
1  1 0 0.8879 0.0208     0.8401     0.9227
2  1 1 0.1121 0.0208     0.0773     0.1599
3  2 0 0.0298 0.0044     0.0223     0.0398
4  2 1 0.9702 0.0044     0.9602     0.9777

Ordinal indicators

First consider the four-category indicators as nominal.

data("election")

df_freq <- election[, 1:5] %>% table() %>% as.data.frame()
df_freq$X <- latentFactor(NROW(df_freq), 2)

# Create formula
formula <- ~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)

fit_cvam <- cvam(formula, data = df_freq, freq = Freq,
                 control = list(startValJitter = 0.1))

summary(fit_cvam)
~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)

Prior:
      Flattening frequency = 0
Total nuggets + flattening = 0
              Ridge factor = 0
          Intensity factor = 1

Sample size:
              total N in supplied data = 1495
N from supplied data used in model fit = 1495
           prior effective sample size =    0

Degrees of freedom:
    patterns of coarsened data = 1024
  cells in complete-data table = 2048
cells without latent variables = 1024
         structural zero cells =    0
   parameters in Poisson model =   32
                            df =  992

Starting values:
default, center
jitter SD = 0.100000

EM algorithm:
Converged at iteration 115
Gradient length = 0.000034

  Final logP = 1178.373
Final loglik = 1178.373

Estimates from EM, with Hessian-based SEs
                coef      SE  zstat   pval
(Intercept) -3.63815 0.15757 -23.09 0.0000
X1           1.13778 0.24287   4.68 0.0000
MORALG1      0.16551 0.12671   1.31 0.1915
MORALG2      1.35204 0.09074  14.90 0.0000
MORALG3     -0.21237 0.13418  -1.58 0.1135
CARESG1     -0.82327 0.21057  -3.91 0.0001
CARESG2      1.03761 0.09673  10.73 0.0000
CARESG3      0.44928 0.11499   3.91 0.0001
KNOWG1       0.51791 0.10565   4.90 0.0000
KNOWG2       1.59075 0.08528  18.65 0.0000
KNOWG3      -0.39963 0.14737  -2.71 0.0067
LEADG1      -0.75474 0.20225  -3.73 0.0002
LEADG2       1.10371 0.10844  10.18 0.0000
LEADG3       0.72692 0.11187   6.50 0.0000
DISHONG1    -1.11446 0.10390 -10.73 0.0000
DISHONG2    -0.08549 0.06760  -1.26 0.2060
DISHONG3     0.76925 0.05154  14.93 0.0000
X1:MORALG1  -1.54191 0.11903 -12.95 0.0000
X1:MORALG2  -0.53811 0.09313  -5.78 0.0000
X1:MORALG3   0.86393 0.13451   6.42 0.0000
X1:CARESG1  -1.75411 0.20513  -8.55 0.0000
X1:CARESG2  -0.60193 0.09661  -6.23 0.0000
X1:CARESG3   0.94726 0.10984   8.62 0.0000
X1:KNOWG1   -1.24453 0.10332 -12.05 0.0000
X1:KNOWG2   -0.37086 0.08936  -4.15 0.0000
X1:KNOWG3    0.80897 0.15394   5.25 0.0000
X1:LEADG1   -1.73558 0.20081  -8.64 0.0000
X1:LEADG2   -0.67058 0.10758  -6.23 0.0000
X1:LEADG3    0.74112 0.11295   6.56 0.0000
X1:DISHONG1  0.73923 0.10896   6.78 0.0000
X1:DISHONG2  0.37220 0.07144   5.21 0.0000
X1:DISHONG3 -0.22209 0.05838  -3.80 0.0001
what <- c(~X, ~MORALG | X, ~CARESG | X, ~KNOWG | X, 
          ~LEADG | X, ~DISHONG | X)
cvamEstimate(what, fit_cvam)
Estimates and SE's from EM, linearized
~ X
  X   prob     SE prob.lower prob.upper
1 1 0.4482 0.0218     0.4058     0.4912
2 2 0.5518 0.0218     0.5088     0.5942
~ MORALG | X
  X            MORALG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0473 0.0103     0.0307     0.0722
2 1      2 Quite well 0.4224 0.0236     0.3769     0.4693
3 1    3 Not too well 0.3591 0.0214     0.3184     0.4020
4 1 4 Not well at all 0.1712 0.0159     0.1423     0.2047
5 2  1 Extremely well 0.4392 0.0204     0.3996     0.4796
6 2      2 Quite well 0.5273 0.0194     0.4891     0.5651
7 2    3 Not too well 0.0271 0.0078     0.0154     0.0475
8 2 4 Not well at all 0.0064 0.0036     0.0021     0.0191
~ CARESG | X
  X            CARESG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0098 0.0051     0.0035     0.0272
2 1      2 Quite well 0.1990 0.0240     0.1561     0.2501
3 1    3 Not too well 0.5201 0.0227     0.4755     0.5644
4 1 4 Not well at all 0.2711 0.0197     0.2343     0.3114
5 2  1 Extremely well 0.3012 0.0182     0.2668     0.3379
6 2      2 Quite well 0.6117 0.0184     0.5752     0.6471
7 2    3 Not too well 0.0722 0.0134     0.0499     0.1032
8 2 4 Not well at all 0.0149 0.0054     0.0073     0.0303
~ KNOWG | X
  X             KNOWG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0836 0.0122     0.0627     0.1108
2 1      2 Quite well 0.5858 0.0210     0.5440     0.6263
3 1    3 Not too well 0.2604 0.0185     0.2258     0.2984
4 1 4 Not well at all 0.0701 0.0103     0.0525     0.0932
5 2  1 Extremely well 0.4375 0.0203     0.3982     0.4776
6 2      2 Quite well 0.5340 0.0193     0.4961     0.5715
7 2    3 Not too well 0.0224 0.0077     0.0114     0.0438
8 2 4 Not well at all 0.0061 0.0031     0.0022     0.0164
~ LEADG | X
  X             LEADG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0107 0.0052     0.0041     0.0276
2 1      2 Quite well 0.1985 0.0230     0.1573     0.2473
3 1    3 Not too well 0.5588 0.0223     0.5148     0.6019
4 1 4 Not well at all 0.2320 0.0184     0.1979     0.2701
5 2  1 Extremely well 0.2774 0.0177     0.2442     0.3133
6 2      2 Quite well 0.6133 0.0184     0.5766     0.6488
7 2    3 Not too well 0.1026 0.0153     0.0762     0.1367
8 2 4 Not well at all 0.0067 0.0037     0.0023     0.0197
~ DISHONG | X
  X           DISHONG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.1569 0.0152     0.1294     0.1890
2 1      2 Quite well 0.3041 0.0195     0.2674     0.3436
3 1    3 Not too well 0.3946 0.0213     0.3537     0.4371
4 1 4 Not well at all 0.1443 0.0159     0.1159     0.1784
5 2  1 Extremely well 0.0217 0.0054     0.0132     0.0353
6 2      2 Quite well 0.0875 0.0111     0.0681     0.1119
7 2    3 Not too well 0.3728 0.0189     0.3367     0.4105
8 2 4 Not well at all 0.5179 0.0200     0.4787     0.5570

Now consider them “ordinal” using an adjacent-category logit model, a.k.a. the “nominal-by-linear” model in Goodman’s loglinear terminology: the class-by-indicator interactions are restricted to a single linear effect of equally spaced category scores per indicator.

# Rescale a numeric vector to the [0, 1] interval
scale_01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Assign equally spaced scores in [0, 1] to the levels of a factor
factor_to_numeric_01 <- function(x) {
  x_num <- seq_along(levels(x))[x]
  scale_01(x_num)
}

X <- fit_cvam$modelMatrix
X <- X %>% as_tibble() %>% 
  # Remove nominal interaction terms
  select_if(!grepl("X.:.+", names(.))) 
# Now include linear interaction terms by hand
X$`X1:MORALG` <- X$X1 * factor_to_numeric_01(fit_cvam$mfTrue$MORALG)
X$`X1:CARESG` <- X$X1 * factor_to_numeric_01(fit_cvam$mfTrue$CARESG)
X$`X1:KNOWG` <- X$X1 * factor_to_numeric_01(fit_cvam$mfTrue$KNOWG)
X$`X1:LEADG` <- X$X1 * factor_to_numeric_01(fit_cvam$mfTrue$LEADG)
X$`X1:DISHONG` <- X$X1 * factor_to_numeric_01(fit_cvam$mfTrue$DISHONG)

rmarkdown::paged_table(X)
X <- as.matrix(X)

We can now fit the model with ordinal indicators by passing our custom design matrix X to cvam.

fit_cvam_ord <- cvam(formula, data = df_freq, 
                     freq = Freq, modelMatrix = X, 
                     control = list(startValJitter = 0.1))
Note: Estimate at or near boundary
Estimated variances may be unreliable
summary(fit_cvam_ord)
~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)

Prior:
      Flattening frequency = 0
Total nuggets + flattening = 0
              Ridge factor = 0
          Intensity factor = 1

Sample size:
              total N in supplied data = 1495
N from supplied data used in model fit = 1495
           prior effective sample size =    0

Degrees of freedom:
    patterns of coarsened data = 1024
  cells in complete-data table = 2048
cells without latent variables = 1024
         structural zero cells =    0
   parameters in Poisson model =   22
                            df = 1002

Starting values:
default, center
jitter SD = 0.100000

EM algorithm:
Converged at iteration 102
Gradient length = 0.000028

  Final logP = 1163.216
Final loglik = 1163.216

Estimates from EM, with Hessian-based SEs
                coef      SE  zstat   pval
(Intercept) -4.17560 0.14364 -29.07 0.0000
X1          -4.70561 0.23787 -19.78 0.0000
MORALG1      0.19249 0.07959   2.42 0.0156
MORALG2      1.48030 0.08811  16.80 0.0000
MORALG3      0.07357 0.06576   1.12 0.2632
CARESG1     -0.80496 0.10068  -8.00 0.0000
CARESG2      1.26607 0.09138  13.86 0.0000
CARESG3      0.74189 0.07953   9.33 0.0000
KNOWG1       0.60969 0.07515   8.11 0.0000
KNOWG2       1.68829 0.07947  21.25 0.0000
KNOWG3      -0.10932 0.07033  -1.55 0.1201
LEADG1      -0.82379 0.09712  -8.48 0.0000
LEADG2       1.22339 0.07733  15.82 0.0000
LEADG3       0.86062 0.07474  11.51 0.0000
DISHONG1    -1.22892 0.08710 -14.11 0.0000
DISHONG2    -0.04606 0.05561  -0.83 0.4076
DISHONG3     0.78925 0.05224  15.11 0.0000
X1:MORALG    3.46023 0.25312  13.67 0.0000
X1:CARESG    4.15955 0.29922  13.90 0.0000
X1:KNOWG     2.70613 0.21085  12.83 0.0000
X1:LEADG     3.85675 0.25308  15.24 0.0000
X1:DISHONG  -1.75815 0.12500 -14.07 0.0000
cvamEstimate(what, fit_cvam_ord)
Estimates and SE's from EM, linearized
~ X
  X   prob     SE prob.lower prob.upper
1 1 0.4492 0.0204     0.4097     0.4893
2 2 0.5508 0.0204     0.5107     0.5903
~ MORALG | X
  X            MORALG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0385 0.0071     0.0267     0.0552
2 1      2 Quite well 0.4421 0.0201     0.4032     0.4817
3 1    3 Not too well 0.3432 0.0196     0.3059     0.3825
4 1 4 Not well at all 0.1762 0.0158     0.1474     0.2092
5 2  1 Extremely well 0.4471 0.0199     0.4086     0.4862
6 2      2 Quite well 0.5114 0.0172     0.4777     0.5449
7 2    3 Not too well 0.0395 0.0069     0.0279     0.0556
8 2 4 Not well at all 0.0020 0.0007     0.0010     0.0040
~ CARESG | X
  X            CARESG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0066 0.0019     0.0037     0.0117
2 1      2 Quite well 0.2103 0.0220     0.1704     0.2567
3 1    3 Not too well 0.4981 0.0212     0.4566     0.5396
4 1 4 Not well at all 0.2850 0.0197     0.2479     0.3252
5 2  1 Extremely well 0.3043 0.0178     0.2705     0.3403
6 2      2 Quite well 0.6033 0.0177     0.5681     0.6374
7 2    3 Not too well 0.0893 0.0136     0.0659     0.1198
8 2 4 Not well at all 0.0032 0.0011     0.0016     0.0064
~ KNOWG | X
  X             KNOWG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0825 0.0105     0.0641     0.1055
2 1      2 Quite well 0.5980 0.0166     0.5650     0.6301
3 1    3 Not too well 0.2442 0.0166     0.2132     0.2781
4 1 4 Not well at all 0.0752 0.0103     0.0574     0.0981
5 2  1 Extremely well 0.4391 0.0195     0.4014     0.4776
6 2      2 Quite well 0.5239 0.0165     0.4915     0.5561
7 2    3 Not too well 0.0352 0.0058     0.0255     0.0484
8 2 4 Not well at all 0.0018 0.0006     0.0009     0.0034
~ LEADG | X
  X             LEADG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.0077 0.0020     0.0046     0.0129
2 1      2 Quite well 0.2154 0.0210     0.1771     0.2593
3 1    3 Not too well 0.5419 0.0209     0.5007     0.5825
4 1 4 Not well at all 0.2351 0.0180     0.2017     0.2721
5 2  1 Extremely well 0.2803 0.0171     0.2481     0.3150
6 2      2 Quite well 0.6004 0.0174     0.5659     0.6338
7 2    3 Not too well 0.1155 0.0140     0.0908     0.1459
8 2 4 Not well at all 0.0038 0.0011     0.0022     0.0068
~ DISHONG | X
  X           DISHONG   prob     SE prob.lower prob.upper
1 1  1 Extremely well 0.1638 0.0144     0.1375     0.1940
2 1      2 Quite well 0.2976 0.0166     0.2660     0.3311
3 1    3 Not too well 0.3818 0.0142     0.3543     0.4100
4 1 4 Not well at all 0.1568 0.0138     0.1316     0.1859
5 2  1 Extremely well 0.0158 0.0029     0.0110     0.0226
6 2      2 Quite well 0.0925 0.0087     0.0768     0.1110
7 2    3 Not too well 0.3833 0.0137     0.3568     0.4105
8 2 4 Not well at all 0.5084 0.0180     0.4732     0.5436

The ordinal model is more parsimonious, using 10 fewer parameters. The likelihood-ratio test below favors the nominal model (a change in deviance of 30.3 on 10 df, p < 0.001), but by BIC the more parsimonious ordinal model is preferred (rank 1).

anova(fit_cvam_ord, fit_cvam)
Model 1: ~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)
Model 2: ~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)
  resid.df -2*loglik df change
1     1002   -2326.4          
2      992   -2356.8 10 30.316
anova(fit_cvam_ord, fit_cvam, method = "BIC")
Model 1: ~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)
Model 2: ~ X * (MORALG + CARESG + KNOWG + LEADG + DISHONG)
  resid.df -2*loglik     BIC rank
1     1002   -2326.4 -2165.6    1
2      992   -2356.8 -2122.8    2

Linear effects among observed and latent variables (DFactor models)

Missing-not-at-random (MNAR) / Unseen data-dependent models

Doing everything by hand

To see what happens under the hood, we can implement the EM algorithm for loglinear LCA ourselves. The M-step fits a Poisson loglinear model to the frequency table expanded over the latent classes, weighted by the current posterior class probabilities; the E-step then recomputes those posteriors from the fitted joint probabilities.

options(contrasts = rep("contr.sum", 2))

varnames <- LETTERS[1:5]
sum_vars <- paste(varnames, collapse = "+")

formula <- sprintf("Freq ~ X * (%s)", sum_vars)
# formula <- sprintf("Freq ~ X * (%s) + B:E + G:A + D:F + B:G + G:D", sum_vars)

df_freq <- carcinoma[, varnames] |> table() |> as.data.frame()
n <- nrow(df_freq)
df_freq$patnum <- 1:n

df_expanded <- rbind(df_freq, df_freq)  # Expand the table over the two latent classes
df_expanded$X <- rep(0:1, each = n)
post_start <- runif(n)                  # Random starting values for the posteriors
df_expanded$post <- c(post_start, 1 - post_start)

suppressWarnings( # LCAs often have boundary values within classes
  for(i in 1:200) {
    # M-step
    # Loglinear model
    fit_glm <- glm(formula, 
                   data = df_expanded, weights = post, 
                   family = "poisson")
    
    # E-step
    eta.X <- predict(fit_glm)               # Linear predictor for each (pattern, class) cell
    eta.X <- matrix(eta.X, nrow = n) |> t() # Reshape: rows = classes, columns = patterns
    n.X <- sum(exp(eta.X))                  # Normalizing constant (total fitted count)
    P_YX <- exp(eta.X)/n.X                  # Joint probability of each (X, Y) cell
    
    P_Y <- colSums(P_YX)                    # Marginal probability of each pattern Y
    P_X.Y <- t(P_YX) / P_Y                  # Posterior of X given Y
    
    df_expanded$post <- as.vector(P_X.Y)
})

summary(fit_glm)

Call:
glm(formula = formula, family = "poisson", data = df_expanded, 
    weights = post)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.27223  -0.00021   0.00000   0.00006   2.32773  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -20.7518  3878.9737  -0.005   0.9957    
X              9.5193  3997.5153   0.002   0.9981    
A1             1.0887     0.2304   4.726 2.29e-06 ***
B1             0.3282     0.1401   2.343   0.0191 *  
C1            11.0176  2854.0327   0.004   0.9969    
D1            10.9489  2626.9629   0.004   0.9967    
E1             0.6686     0.1651   4.050 5.11e-05 ***
X:A1         -11.0505   966.2770  -0.011   0.9909    
X:B1          -2.3191     0.5229  -4.435 9.22e-06 ***
X:C1         -11.5198  2854.0327  -0.004   0.9968    
X:D1         -10.9899  2626.9629  -0.004   0.9967    
X:E1          -2.5160     0.4704  -5.349 8.83e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 346.375  on 63  degrees of freedom
Residual deviance:  31.418  on 52  degrees of freedom
AIC: 98.345

Number of Fisher Scoring iterations: 17
P_X <- rowSums(P_YX) # Prior
P_Y.X <- P_YX/P_X    # Conditional of Y given X

P_Y.X_marginal <- sapply(varnames, function(v) {
  apply(P_Y.X, 1, function(x) tapply(x, df_freq[, v], sum))
})

P_Y.X_marginal |> round(3)
         A     B     C    D     E
[1,] 0.898 0.658 1.000 1.00 0.792
[2,] 0.102 0.342 0.000 0.00 0.208
[3,] 0.000 0.018 0.268 0.48 0.024
[4,] 1.000 0.982 0.732 0.52 0.976
# fit_polca$probs |> lapply(round, 3)

N <- sum(df_freq$Freq)

expected <- P_Y * N
observed <- df_freq$Freq

loglik <- sum(observed * log(P_Y))  # Multinomial log-likelihood (up to a constant)
deviance <- -2*loglik
p <- length(coef(fit_glm))          # Number of estimated parameters
bic_deviance <- deviance + p*log(N)
bic_deviance
[1] 556.776
X2 <- sum((observed - expected)^2/expected)
G2 <- 2*sum(observed * log(observed/expected), na.rm = TRUE)

data.frame(Chisq = c(X2, fit_polca$Chisq), Gsq = c(G2, fit_polca$Gsq), method = c("loglinear", "poLCA"))
     Chisq      Gsq    method
1 27.48417 27.98881 loglinear
2 92.64814 62.36543     poLCA
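
Note that this by-hand example used only the first five indicators (A-E), whereas the poLCA fit above used all seven, so the two rows of this table are not directly comparable.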

Alternative R implementation

This appears to work as well, though it requires more hand-holding than cvam.

library(gllm)

options(contrasts = rep("contr.treatment", 2))

K <- 3  # Number of latent classes

# Stack K copies of the observed frequency table, one per latent class
df_expanded <- Reduce(rbind, replicate(K, df_freq, simplify = FALSE))
df_expanded$X <- factor(rep(seq(K), each = n))

formula_gllm <- ~ X * (A + B + C + D + E)
X <- model.matrix(formula_gllm, data = df_expanded)
# Scatter indices mapping each complete-table cell to its observed cell
s <- replicate(K, 1:nrow(df_freq))

res <- emgllm(y = df_freq$Freq, s = s, X = X)
res
$deviance
[1] 28.16023

$observed.values
 [1] 34  2  7  3  0  0  0  0  0  0  0  1  0  0  0  0  2  0  9 10  0  1  0 18  0
[26]  0  0  5  0  0  0 26

$fitted.values
 [1] 2.755131e+01 2.760242e+00 1.386034e+01 1.558156e+00 1.409730e-02
 [6] 9.989355e-03 8.454280e-03 4.817056e-01 1.084389e-02 4.451493e-03
[11] 5.989757e-03 1.892922e-01 3.258625e-05 9.464652e-03 1.519161e-03
[16] 5.304967e-01 6.896560e+00 8.201101e-01 3.489994e+00 7.632804e+00
[21] 4.566682e-03 3.656668e-01 6.030415e-02 2.047734e+01 3.121613e-03
[26] 1.436234e-01 2.433266e-02 8.035526e+00 1.153064e-03 4.029088e-01
[31] 6.455638e-02 2.258447e+01

$full.table
 [1] 2.755131e+01 2.757187e+00 1.385986e+01 1.387020e+00 1.407245e-02
 [6] 1.408294e-03 7.079230e-03 7.084511e-04 1.083403e-02 1.084211e-03
[11] 5.450125e-03 5.454192e-04 5.533722e-06 5.537851e-07 2.783773e-06
[16] 2.785850e-07 6.896190e+00 6.901336e-01 3.469171e+00 3.471759e-01
[21] 3.522383e-03 3.525011e-04 1.771956e-03 1.773278e-04 2.711796e-03
[26] 2.713819e-04 1.364186e-03 1.365203e-04 1.385110e-06 1.386144e-07
[31] 6.967882e-07 6.973081e-08 4.363061e-06 1.526518e-03 2.445662e-04
[36] 8.556714e-02 1.226294e-05 4.290473e-03 6.873842e-04 2.404973e-01
[41] 4.812064e-06 1.683612e-03 2.697345e-04 9.437287e-02 1.352492e-05
[46] 4.732006e-03 7.581231e-04 2.652469e-01 1.857467e-04 6.498778e-02
[51] 1.041181e-02 3.642812e+00 5.220646e-04 1.826564e-01 2.926371e-02
[56] 1.023859e+01 2.048619e-04 7.167568e-02 1.148329e-02 4.017695e+00
[61] 5.757904e-04 2.014536e-01 3.227524e-02 1.129224e+01 4.363061e-06
[66] 1.526518e-03 2.445662e-04 8.556714e-02 1.226294e-05 4.290473e-03
[71] 6.873842e-04 2.404973e-01 4.812064e-06 1.683612e-03 2.697345e-04
[76] 9.437287e-02 1.352492e-05 4.732006e-03 7.581231e-04 2.652469e-01
[81] 1.857467e-04 6.498778e-02 1.041181e-02 3.642812e+00 5.220646e-04
[86] 1.826564e-01 2.926371e-02 1.023859e+01 2.048619e-04 7.167568e-02
[91] 1.148329e-02 4.017695e+00 5.757904e-04 2.014536e-01 3.227524e-02
[96] 1.129224e+01
with(res, sum((fitted.values - observed.values)^2/fitted.values)) # Pearson X^2
[1] 28.49572
with(res, 2*sum(observed.values * log(observed.values/fitted.values), na.rm = TRUE)) # Deviance G^2
[1] 28.16023