Curating and Integrating Data from Multiple Sources to Support Healthcare Analytics

Stud Health Technol Inform. 2015:216:1056.

Abstract

As the volume and variety of healthcare-related data continue to grow, the analysis and use of this data will increasingly depend on the ability to appropriately collect, curate and integrate disparate data from many different sources. We describe our approach to, and highlight our experiences with, the development of a robust data collection, curation and integration infrastructure that supports healthcare analytics. This system has been successfully applied to data from over 600 subjects, spanning a variety of data types: clinical data from electronic health records and observational studies, genomic data, microbiomic data, self-reported data from surveys and self-tracked data from wearable devices. The curated data is currently being used to support healthcare analytic applications such as data visualization, patient stratification and predictive modeling.

MeSH terms

  • Data Accuracy*
  • Electronic Health Records / organization & administration*
  • Health Services Research / organization & administration*
  • Information Storage and Retrieval / methods*
  • Medical Record Linkage / methods*
  • Models, Organizational*
  • Systems Integration
  • United States