Analysis by categorizing or dichotomizing continuous variables is inadvisable: an example from the natural history of unruptured aneurysms

AJNR Am J Neuroradiol. 2011 Mar;32(3):437-40. doi: 10.3174/ajnr.A2425. Epub 2011 Feb 17.

Abstract

In medical research analyses, continuous variables are often converted into categoric variables by grouping values into ≥2 categories. The simplicity achieved by creating ≥2 artificial groups has a cost: Grouping may create rather than avoid problems. In particular, dichotomization leads to a considerable loss of power and to incomplete correction for confounding factors. Data-derived "optimal" cut-points can lead to serious bias and should at least be tested on independent observations to assess their validity. Both problems are illustrated by the way the results of a registry on unruptured intracranial aneurysms are commonly used. Such results should be applied to clinical decision-making only with extreme caution. Categorization of continuous data, especially dichotomization, is unnecessary for statistical analysis. Continuous explanatory variables should be left alone in statistical models.
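The two problems named in the abstract, loss of power from dichotomization and bias from data-derived cut-points, can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no registry data: it assumes a normally distributed predictor, a linear effect, and an approximate Fisher z-test of correlation in place of the proportional hazards models listed under the MeSH terms. All function names, the sample size, the slope, and the grid of candidate cut-points are hypothetical choices made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, n):
    """Approximate z statistic for H0: no correlation (Fisher transform)."""
    return math.atanh(r) * math.sqrt(n - 3)


def split_at(xs, cut):
    """Dichotomize: 1.0 if the value is at or above the cut, else 0.0."""
    return [1.0 if x >= cut else 0.0 for x in xs]


def simulate(slope, n=100, n_sims=1000, seed=0, z_crit=1.96):
    """Return rejection rates for three analyses of the same simulated data."""
    rng = random.Random(seed)
    hits = {"continuous": 0, "median_split": 0, "optimal_cut": 0}
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [slope * xi + rng.gauss(0, 1) for xi in x]
        ranked = sorted(x)
        # 1) Leave the predictor continuous.
        if abs(fisher_z(pearson_r(x, y), n)) > z_crit:
            hits["continuous"] += 1
        # 2) Prespecified dichotomization at the sample median.
        x_median = split_at(x, ranked[n // 2])
        if abs(fisher_z(pearson_r(x_median, y), n)) > z_crit:
            hits["median_split"] += 1
        # 3) Data-derived cut: try a grid of cut-points between the 10th
        #    and 90th percentiles and keep the most significant one.
        best = max(
            abs(fisher_z(pearson_r(split_at(x, ranked[k]), y), n))
            for k in range(n // 10, n - n // 10 + 1, 5)
        )
        if best > z_crit:
            hits["optimal_cut"] += 1
    return {name: count / n_sims for name, count in hits.items()}


# With a true effect, the rejection rate is the power of each analysis.
power = simulate(slope=0.25)
# With no effect, the rejection rate is the false-positive rate.
null = simulate(slope=0.0)

print("power, continuous predictor:", power["continuous"])
print("power, median split:", power["median_split"])
print("false-positive rate, continuous:", null["continuous"])
print("false-positive rate, 'optimal' cut:", null["optimal_cut"])
```

Under these assumptions, comparing the first two printed values shows what the median split costs in power, and comparing the last two shows how selecting the most significant cut-point inflates the false-positive rate above the nominal 5% level. Evaluating a chosen cut-point on independent observations, as the abstract recommends, is what removes that selection effect.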

Trial registration: ClinicalTrials.gov NCT00537134.

Publication types

  • Clinical Trial
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aneurysm, Ruptured / diagnosis*
  • Aneurysm, Ruptured / epidemiology*
  • Bias*
  • Data Interpretation, Statistical*
  • Humans
  • Intracranial Aneurysm / diagnosis*
  • Intracranial Aneurysm / epidemiology*
  • Prevalence
  • Proportional Hazards Models*
  • Reproducibility of Results
  • Risk Assessment / methods
  • Risk Factors
  • Sensitivity and Specificity

Associated data

  • ClinicalTrials.gov/NCT00537134