Desiderata for a Synthetic Clinical Data Generator

Stud Health Technol Inform. 2021 May 27:281:68-72. doi: 10.3233/SHTI210122.

Abstract

The current movement in Medical Informatics towards comprehensive Electronic Health Records (EHRs) has enabled a wide range of secondary use cases for these data. However, due to a number of well-justified concerns and barriers, especially with regard to information privacy, access to real medical records by researchers is often not possible, and indeed not always required. An appealing alternative to the use of real patient data is a generator for realistic, yet synthetic, EHRs. However, we have identified a number of shortcomings in prior works, especially with regard to the adaptability of the projects to the requirements of the German healthcare system. Based on three case studies, we define a non-exhaustive list of requirements for an ideal generator project that can be used in a wide range of localities and settings, to inform and enable future work in this area.

Keywords: RS-EHR; Realistic Synthetic Electronic Health Records; Requirements; Secondary Use; Synthetic Data.

MeSH terms

  • Electronic Health Records*
  • Humans
  • Medical Informatics*
  • Privacy