Background: Electronic health records (EHRs) are an important source of information on the diagnosis and treatment of rare health conditions such as congenital hemophilia, a bleeding disorder characterized by deficiency of factor VIII (FVIII) or factor IX (FIX).
Objective: To identify patients with congenital hemophilia using EHRs.
Design: An EHR database study.
Setting: EHRs were accessed from Humedica between January 1, 2007, and July 31, 2013.
Patients: Patients were selected on the basis of an initial ICD-9-CM diagnosis code of 286.0 (hemophilia A) or 286.1 (hemophilia B) and confirmation of records 6 months before and 12 months after the first diagnosis. Additional selection criteria included mention of "hemophilia" and "blood" or "bleed" within physician notes, identified via natural language processing.
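The selection criteria above can be sketched as a simple filter. This is a minimal illustration, not the study's actual pipeline: the patient dictionary layout and all function names are hypothetical, the record-window check assumes that "confirmation of records" means activity on or before the start and on or after the end of the window, and a plain keyword match stands in for the natural language processing step, reading the text criterion as "hemophilia" together with either "blood" or "bleed".

```python
from datetime import date, timedelta

# ICD-9-CM codes named in the selection criteria.
HEMOPHILIA_CODES = {"286.0", "286.1"}  # hemophilia A, hemophilia B


def first_hemophilia_diagnosis(diagnoses):
    """Return the earliest date carrying a 286.0/286.1 code, or None.

    diagnoses: list of (date, icd9_code) tuples (hypothetical layout).
    """
    dates = [when for when, code in diagnoses if code in HEMOPHILIA_CODES]
    return min(dates) if dates else None


def has_record_window(record_dates, index_date, days_before=182, days_after=365):
    """Assumed reading of the criterion: the patient has activity on or
    before the window start and on or after the window end.
    Six months is approximated as 182 days."""
    start = index_date - timedelta(days=days_before)
    end = index_date + timedelta(days=days_after)
    return any(d <= start for d in record_dates) and any(d >= end for d in record_dates)


def notes_match(notes):
    """Keyword match standing in for the NLP step (an assumption).
    Substring matching means "bleed" also matches "bleeding"."""
    text = " ".join(notes).lower()
    return "hemophilia" in text and ("blood" in text or "bleed" in text)


def meets_selection_criteria(patient):
    """Apply all three criteria to one patient dictionary."""
    index_date = first_hemophilia_diagnosis(patient["diagnoses"])
    if index_date is None:
        return False
    return (has_record_window(patient["record_dates"], index_date)
            and notes_match(patient["notes"]))
```

A patient would be passed in as a dictionary with "diagnoses", "record_dates", and "notes" keys; a real implementation would need calendar-accurate month arithmetic and a proper clinical text pipeline.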
Results: A total of 129 males and 35 females were identified as the analysis population. Of those patients for whom both prothrombin time and activated partial thromboplastin time test results were available, only 56% of males and 7% of females exhibited a pattern of test results consistent with congenital hemophilia (normal prothrombin time and prolonged activated partial thromboplastin time). Few patients had a prescription for a hemophilia treatment; males most commonly received Amicar (10.8%) or FVIII (9.0%), whereas females most commonly received DDAVP (11.0%). The most identifiable sites of pain were the chest and the abdomen; 41% of males and 37% of females had joint pain. To evaluate whether patients had been correctly identified with congenital hemophilia, the EHRs of 6 patients were reviewed; detailed assessment found their data to be inconsistent with a conclusive diagnosis of congenital hemophilia.
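The laboratory pattern described above (normal prothrombin time with prolonged activated partial thromboplastin time) can be expressed as a small check. This is a sketch under stated assumptions: the function names are hypothetical, and the default reference limits are illustrative placeholders only, since reference ranges are laboratory-specific and the abstract does not report the ones used.

```python
def is_hemophilia_pattern(pt_seconds, aptt_seconds,
                          pt_range=(11.0, 13.5), aptt_upper=35.0):
    """True when PT is within range and aPTT exceeds its upper limit.

    pt_range and aptt_upper are placeholder defaults (assumptions);
    substitute the reporting laboratory's own reference limits.
    """
    pt_low, pt_high = pt_range
    pt_normal = pt_low <= pt_seconds <= pt_high
    aptt_prolonged = aptt_seconds > aptt_upper
    return pt_normal and aptt_prolonged


def share_consistent(results):
    """Fraction of patients with both tests available who show the pattern.

    results: list of (pt_seconds, aptt_seconds) pairs, with None marking a
    missing test. Returns None when no patient has both results.
    """
    complete = [(pt, aptt) for pt, aptt in results
                if pt is not None and aptt is not None]
    if not complete:
        return None
    hits = sum(is_hemophilia_pattern(pt, aptt) for pt, aptt in complete)
    return hits / len(complete)
```

Restricting the denominator to patients with both results mirrors how the percentages above are framed; patients missing either test are excluded rather than counted as inconsistent.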
Limitations: Inconsistent coding practices may affect data integrity.
Conclusion: A potentially high number of false-positive identifications, particularly among female patients, suggests that ICD-9-CM coding alone may be insufficient to identify patient cohorts. In-depth reviews and multimodal analysis of chart notes may improve data integrity.
Keywords: big data; congenital hemophilia; database; electronic health record.