Dealing with deficient and missing data

Prev Vet Med. 2015 Nov 1;122(1-2):221-8. doi: 10.1016/j.prevetmed.2015.04.006. Epub 2015 Apr 20.

Abstract

Disease control decisions require two types of data: data describing the disease frequency (incidence and prevalence) along with characteristics of the population and environment in which the disease occurs (hereafter called "descriptive data"); and data for analytical studies (hereafter called "analytical data") documenting the effects of risk factors for the disease. Both may be either deficient or missing. Descriptive data may be completely missing if the disease is a new and unknown entity with no diagnostic procedures or if there has been no surveillance activity in the population of interest. Methods for dealing with this complete absence of data are limited, but the possible use of surrogate measures of disease will be discussed. More often, data are deficient because of limitations in diagnostic capabilities (imperfect sensitivity and specificity). Developments in methods for dealing with this form of information bias make this a more tractable problem. Deficiencies in analytical data leading to biased estimates of the effects of risk factors are a common and increasingly recognized problem, but options for correcting known or suspected biases are still limited. Data about risk factors may be completely missing if studies of risk factors have not been carried out. Alternatively, data for evaluation of risk factors may be available but have "item missingness", where some (or many) observations have some pieces of information missing. There has been tremendous development in the methods to deal with this problem of "item missingness" over the past decade, with multiple imputation being the most prominent method. The use of multiple imputation to deal with item missingness will be compared to the use of complete-case analysis, and limitations to the applicability of imputation will be presented.
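
As an illustration of how imperfect sensitivity and specificity can be handled in descriptive data, the sketch below applies the Rogan-Gladen estimator, one widely used correction of apparent prevalence. The abstract does not name a specific method, so the choice of estimator, the function name `rogan_gladen`, and the example values are assumptions made for illustration only.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Estimate true prevalence from apparent (test-positive) prevalence.

    Rogan-Gladen estimator: (AP + Sp - 1) / (Se + Sp - 1).
    Illustrative sketch; not necessarily the approach used in the paper.
    """
    for name, value in (
        ("apparent_prevalence", apparent_prevalence),
        ("sensitivity", sensitivity),
        ("specificity", specificity),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # Se + Sp - 1 (Youden's index) must be positive for the test to be
    # informative; otherwise the correction is undefined or meaningless.
    youden = sensitivity + specificity - 1.0
    if youden <= 0.0:
        raise ValueError("sensitivity + specificity must exceed 1")

    estimate = (apparent_prevalence + specificity - 1.0) / youden
    # Sampling error can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Hypothetical values: 20% test-positive, Se = 0.90, Sp = 0.95.
corrected = rogan_gladen(0.20, 0.90, 0.95)
print(f"Estimated true prevalence: {corrected:.3f}")
```

With these hypothetical inputs the corrected prevalence is lower than the apparent 20%, because some of the positives are expected to be false positives. The point estimate alone ignores uncertainty in the sensitivity and specificity values themselves, which a fuller analysis would need to account for.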

Keywords: Analytical data; Bias; Descriptive data; Missing data; Multiple imputation; New disease.

MeSH terms

  • Animal Diseases / epidemiology*
  • Animals
  • Bias
  • Data Interpretation, Statistical*
  • Epidemiologic Research Design / veterinary*
  • Research Design
  • Sensitivity and Specificity