Development of a diagnostic test based on multiple continuous biomarkers with an imperfect reference test

Stat Med. 2016 Feb 20;35(4):595-608. doi: 10.1002/sim.6733. Epub 2015 Sep 21.

Abstract

Ignoring the fact that the reference test used to establish the discriminative properties of a combination of diagnostic biomarkers is imperfect can lead to a biased estimate of the diagnostic accuracy of the combination. In this paper, we propose a Bayesian latent-class mixture model to select a combination of biomarkers that maximizes the area under the ROC curve (AUC), while taking into account the imperfect nature of the reference test. In particular, we develop a method for specifying the prior for the mixture component parameters that allows control over the amount of prior information provided for the AUC. The properties of the model are evaluated in a simulation study and in an application to real data from Alzheimer's disease research. In the simulation study, 100 data sets are simulated for sample sizes ranging from 100 to 600 observations, with varying correlation between biomarkers. Both an informative and a flat prior for the diagnostic accuracy of the reference test are investigated. In the real-data application, the proposed model is compared with the commonly used logistic-regression model, which ignores the imperfection of the reference test. Conditional on the selected sample sizes and prior distributions, the simulation study results indicate satisfactory performance of the model-based estimates. In particular, the average estimates obtained for all parameters are close to the true values. For the real-data application, AUC estimates for the proposed model are substantially higher than those from the 'traditional' logistic-regression model.
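The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the Bayesian latent-class mixture model proposed in the paper; it only shows, under assumed settings, how scoring a biomarker combination against an imperfect reference test attenuates the estimated AUC. All numeric values (prevalence, mean shift, correlation, sensitivity and specificity of the reference test, sample size) and the function names are hypothetical choices made for illustration. The two biomarkers are assumed bivariate normal with a common covariance matrix in both groups, and the reference test is assumed to err independently of the biomarkers given the true disease status.

```python
import math
import numpy as np


def auc_mann_whitney(scores, labels):
    """AUC as the normalized Mann-Whitney U statistic (continuous scores, no ties assumed)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Rank all scores from 1..n (ties are not handled; scores here are continuous).
    ranks = np.empty(scores.size)
    ranks[np.argsort(scores)] = np.arange(1, scores.size + 1)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def simulate(n=20000, prevalence=0.4, rho=0.5, sens=0.80, spec=0.90, seed=0):
    """Compare the AUC against the true status with the AUC against an imperfect reference.

    All default values are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    delta = np.array([1.0, 1.0])                 # assumed mean shift in diseased subjects
    sigma = np.array([[1.0, rho], [rho, 1.0]])   # assumed common covariance matrix

    # Latent (true) disease status and two correlated continuous biomarkers.
    disease = rng.random(n) < prevalence
    x = rng.multivariate_normal(np.zeros(2), sigma, size=n)
    x[disease] += delta

    # Linear combination that maximizes the AUC under this binormal setup:
    # coefficients proportional to sigma^{-1} delta.
    coef = np.linalg.solve(sigma, delta)
    score = x @ coef

    # Imperfect reference test: misclassifies independently of the biomarkers.
    u = rng.random(n)
    reference = np.where(disease, u < sens, u > spec)

    # Closed-form AUC for the optimal combination: Phi(Delta / sqrt(2)),
    # where Delta is the Mahalanobis distance between the two group means.
    mahalanobis = math.sqrt(float(delta @ coef))
    auc_theory = 0.5 * (1.0 + math.erf(mahalanobis / 2.0))

    return {
        "auc_true": auc_mann_whitney(score, disease),
        "auc_naive": auc_mann_whitney(score, reference),
        "auc_theory": auc_theory,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the AUC computed against the reference test falls below the AUC computed against the true disease status, because misclassified subjects dilute the separation between the two groups. A latent-class approach such as the one proposed here treats the true status as unobserved and models the reference test's error explicitly instead of taking its result at face value.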

Keywords: AUC; Bayesian estimation; Biomarkers; Latent-class mixture models.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alzheimer Disease / diagnosis*
  • Area Under Curve
  • Bayes Theorem
  • Biomarkers / analysis*
  • Computer Simulation
  • Diagnostic Tests, Routine / statistics & numerical data*
  • Humans
  • Logistic Models
  • Models, Statistical*
  • ROC Curve

Substances

  • Biomarkers