Analysis of counts with two latent classes, with application to risk assessment based on physician-visit records of cancer survivors

Biostatistics. 2014 Apr;15(2):384-97. doi: 10.1093/biostatistics/kxt052. Epub 2013 Dec 1.

Abstract

Motivated by a cancer survivorship program, this paper explores event counts from two categories of individuals whose category membership is unobservable. We formulate the counts using a latent class model and consider two likelihood-based inference procedures: maximum likelihood estimation (MLE) and a pseudo-MLE procedure. The pseudo-MLE utilizes additional information on one of the latent classes. It reduces the computational burden and can increase estimation efficiency. We establish the consistency and asymptotic normality of the proposed pseudo-MLE, and we present an extended Huber sandwich estimator as a robust variance estimator for the pseudo-MLE. The finite-sample properties of the two parameter estimators, along with their variance estimators, are examined by simulation. The proposed methodology is illustrated with physician-claim data from the cancer program.
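To make the contrast between the two procedures concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible setting: a two-component Poisson mixture without covariates, fitted by EM, with the supplementary information taking the form of a separate sample of counts known to come from class 1. All names (`em_fit`, `mixture_loglik`, `sample_poisson`) and all numeric values are hypothetical and chosen only for illustration. The MLE estimates the mixing proportion and both rates jointly; the pseudo-MLE first estimates the class-1 rate from the supplementary sample, then holds it fixed while maximizing over the remaining parameters.

```python
import math
import random

def poisson_logpmf(y, lam):
    # log P(Y = y) for Y ~ Poisson(lam)
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def component_logs(y, p, lam1, lam2):
    # log of each weighted component density for one count
    return (math.log(p) + poisson_logpmf(y, lam1),
            math.log(1.0 - p) + poisson_logpmf(y, lam2))

def mixture_loglik(counts, p, lam1, lam2):
    total = 0.0
    for y in counts:
        a, b = component_logs(y, p, lam1, lam2)
        m = max(a, b)  # log-sum-exp for numerical stability
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total

def em_fit(counts, p, lam1, lam2, fix_lam1=False, n_iter=200):
    """EM for a two-class Poisson mixture; returns (p, lam1, lam2).

    With fix_lam1=True, lam1 is held at its starting value, which gives
    the plug-in (pseudo-MLE) fit assumed in this sketch.
    """
    n = len(counts)
    for _ in range(n_iter):
        # E-step: posterior probability that each count belongs to class 1
        w = []
        for y in counts:
            a, b = component_logs(y, p, lam1, lam2)
            m = max(a, b)
            ea, eb = math.exp(a - m), math.exp(b - m)
            w.append(ea / (ea + eb))
        # M-step: weighted proportion and weighted means
        sw = sum(w)
        p = min(max(sw / n, 1e-9), 1.0 - 1e-9)
        if not fix_lam1:
            lam1 = max(sum(wi * y for wi, y in zip(w, counts)) / max(sw, 1e-12), 1e-9)
        lam2 = max(sum((1.0 - wi) * y for wi, y in zip(w, counts)) / max(n - sw, 1e-12), 1e-9)
    return p, lam1, lam2

def sample_poisson(rng, lam):
    # Knuth's multiplication method; adequate for small rates
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

# Simulated data (illustrative values only): class 1 has rate 2, class 2 has rate 10.
rng = random.Random(0)
counts = [sample_poisson(rng, 2.0 if rng.random() < 0.7 else 10.0) for _ in range(300)]

# Hypothetical supplementary sample known to come from class 1.
supplementary = [sample_poisson(rng, 2.0) for _ in range(200)]
lam1_hat = sum(supplementary) / len(supplementary)

mle = em_fit(counts, 0.5, 1.0, 5.0)                           # all three parameters free
pseudo = em_fit(counts, 0.5, lam1_hat, 5.0, fix_lam1=True)    # lam1 plugged in

print("MLE        (p, lam1, lam2):", mle)
print("pseudo-MLE (p, lam1, lam2):", pseudo)
```

Because the pseudo-MLE optimizes over fewer parameters, each fit is a lower-dimensional problem, which is one way to read the abstract's remark about reduced computational intensity. The sketch does not attempt the sandwich variance estimator; a naive variance from the second stage alone would ignore the uncertainty in the plugged-in rate.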

Keywords: Efficiency vs. robustness; Mixture Poisson model; Pseudo-maximum likelihood estimation; Robust variance estimation; Supplementary information.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical*
  • Humans
  • Insurance Claim Review / statistics & numerical data
  • Likelihood Functions
  • Models, Statistical*
  • Neoplasms / epidemiology
  • Office Visits / statistics & numerical data
  • Poisson Distribution
  • Risk Assessment
  • Survivors / statistics & numerical data