A fast Monte Carlo EM algorithm for estimation in latent class model analysis with an application to assess diagnostic accuracy for cervical neoplasia in women with AGC

J Appl Stat. 2013;40(12):2699-2719. doi: 10.1080/02664763.2013.825704.

Abstract

In this article we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo EM (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix based standard error estimates with the bootstrap standard error estimates both obtained using the fast MCEM algorithm through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach estimates the standard error similarly with the bootstrap methods under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group (GOG) study of significant cervical lesion (S-CL) diagnosis in women with atypical glandular cells of undetermined significance (AGC) to compare the diagnostic accuracy of a histology-based evaluation, a CA-IX biomarker-based test and a human papillomavirus (HPV) DNA test.

Keywords: MCEM estimation; adjusted information matrix; bootstrap standard errors; diagnostic accuracy; imperfect gold standard; latent class model.