Visual grids for managing data completeness in clinical research datasets

J Biomed Inform. 2015 Apr:54:337-44. doi: 10.1016/j.jbi.2014.12.002. Epub 2014 Dec 30.

Abstract

Missing data arise in clinical research datasets for reasons ranging from incomplete electronic health records to incorrect trial data collection. Missing data adversely affect analyses performed with the dataset and can also affect the management of the clinical trial itself. We propose two graphical visualization schemes to aid in managing the completeness of a clinical research dataset: the binary completeness grid (BCG) for a single patient's observations, and the gradient completeness grid (GCG) for an entire dataset. We use these tools to manage three clinical trials: two ongoing observational trials and one completed cohort study. The completeness grids revealed unexpected patterns in our data and enabled us to identify records that should have been purged and to find missing follow-up data in sets of observations thought to be complete. Binary and gradient completeness grids provide a rapid, convenient way to visualize missing data in clinical datasets.

Keywords: Clinical trial data; Data completeness; Data visualization; Missing data.
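
The abstract does not specify how the two grids are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a BCG marks each field of one patient record as present (1) or missing (0), and a GCG gives, per field, the fraction of records in which that field is present. The function names, the `width` layout parameter, the rule that `None` and empty strings count as missing, and the sample field names are all hypothetical.

```python
from typing import Any, List, Mapping, Sequence


def is_present(value: Any) -> bool:
    # Assumption: None and the empty string count as missing; 0 is a valid value.
    return value is not None and value != ""


def _to_rows(cells: List[Any], width: int) -> List[List[Any]]:
    # Lay a flat list of cells out as grid rows of at most `width` cells.
    return [cells[i:i + width] for i in range(0, len(cells), width)]


def binary_completeness_grid(
    record: Mapping[str, Any], fields: Sequence[str], width: int
) -> List[List[int]]:
    # One cell per field for a single patient: 1 if present, 0 if missing.
    flags = [1 if is_present(record.get(f)) else 0 for f in fields]
    return _to_rows(flags, width)


def gradient_completeness_grid(
    records: Sequence[Mapping[str, Any]], fields: Sequence[str], width: int
) -> List[List[float]]:
    # One cell per field for the whole dataset: fraction of records
    # in which the field is present (0.0 when there are no records).
    n = len(records)
    fractions = [
        sum(is_present(r.get(f)) for r in records) / n if n else 0.0
        for f in fields
    ]
    return _to_rows(fractions, width)


# Hypothetical two-patient dataset, for illustration only.
fields = ["age", "sex", "baseline_bp", "followup_bp"]
records = [
    {"age": 54, "sex": "F", "baseline_bp": 120, "followup_bp": None},
    {"age": 61, "sex": "", "baseline_bp": 135, "followup_bp": 128},
]

bcg_first_patient = binary_completeness_grid(records[0], fields, width=2)
gcg_dataset = gradient_completeness_grid(records, fields, width=2)
```

The nested lists can be passed to any heatmap routine to render the grids: a two-color map for the BCG and a continuous color scale for the GCG.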

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomedical Research / methods*
  • Biomedical Research / standards
  • Clinical Studies as Topic / methods
  • Clinical Studies as Topic / standards
  • Data Collection / methods*
  • Data Collection / standards
  • Electronic Health Records
  • Humans