Data interchange using i2b2

J Am Med Inform Assoc. 2016 Sep;23(5):909-15. doi: 10.1093/jamia/ocv188. Epub 2016 Feb 5.

Abstract

Objective: Reinventing data extraction from electronic health records (EHRs) to meet new analytical needs is slow and expensive. However, each new data research network that wishes to support its own analytics tends to develop its own data model. Joining these different networks without new data extraction, transform, and load (ETL) processes can reduce the time and expense needed to participate. The Informatics for Integrating Biology and the Bedside (i2b2) project supports data network interoperability through an ontology-driven approach. We use i2b2 as a hub, to rapidly reconfigure data to meet new analytical requirements without new ETL programming.

Materials and methods: Our 12-site National Patient-Centered Clinical Research Network (PCORnet) Clinical Data Research Network (CDRN) uses i2b2 to query data. We developed a process to generate a PCORnet Common Data Model (CDM) physical database directly from existing i2b2 systems, thereby supporting PCORnet analytic queries without new ETL programming. This involved: a formalized process for representing i2b2 information models (the specification of data types and formats); an information model that represents CDM Version 1.0; and a program that generates CDM tables, driven by this information model. This approach is generalizable to any logical information model.

Results: Eight PCORnet CDRN sites have implemented this approach and generated a CDM database without a new ETL process from the EHR. This enables federated querying within the CDRN and compatibility with the national PCORnet Distributed Research Network.

Discussion: We have established a way to adapt i2b2 to new information models without requiring changes to the underlying data. Eight Scalable Collaborative Infrastructure for a Learning Health System sites vetted this methodology, resulting in a network that, at present, supports research on 10 million patients' data.

Conclusion: New analytical requirements can be quickly and cost-effectively supported by i2b2 without creating new data extraction processes from the EHR.

Keywords: PCORnet CDM; data integration; data models; informatics for integrating biology and the bedside; medical informatics; ontology-driven data representation; patient centered outcomes research institute.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomedical Research*
  • Data Mining / methods*
  • Databases, Factual*
  • Electronic Health Records
  • Health Information Interoperability*
  • Humans
  • Information Dissemination*
  • Information Storage and Retrieval