Objectives: Existing methods for measuring adverse events in hospitals capture only a limited number of events. Text mining refers to a range of techniques for extracting data from narrative sources. The goal of this study was to evaluate the performance of an automated approach for extracting adverse event keywords from electronic health records.
Methods: The study involved 4 medical centers in the Region of Lombardy. A starting set of keywords was refined through an iterative process to develop queries for 7 adverse events, including those used by the Agency for Healthcare Research and Quality as patient safety indicators. We calculated the positive predictive value of each of the 7 queries and performed an error analysis to identify the reasons for false-positive cases of pulmonary embolism, deep vein thrombosis, and urinary tract infection.
Results: Overall, 397,233 records were collected (34,805 discharge summaries, 292,593 emergency department notes, and 69,835 operation reports). Positive predictive values were high for postoperative wound dehiscence (83.83%) and urinary tract infection (73.07%) but low for deep vein thrombosis (5.37%), pulmonary embolism (13.63%), and postoperative sepsis (12.28%). The most common reasons for false positives were mentions of past events (42.25%), negations (22.80%), and conditions suspected by physicians but not confirmed by a diagnostic test (11.25%).
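As a rough illustration of how a keyword query relates to its positive predictive value, the following minimal Python sketch flags notes that contain a keyword and then computes the share of flagged notes that a reviewer confirmed. It is not the study's actual query or data: the keyword list, the example notes, the review labels, and the function names `matches_query` and `positive_predictive_value` are all hypothetical and chosen only to mirror the false-positive reasons reported above (negations and past events).

```python
# Hypothetical keyword list for one adverse event (not the study's real query).
KEYWORDS = ["pulmonary embolism"]


def matches_query(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the note (case-insensitive)."""
    lowered = text.lower()
    return any(kw in lowered for kw in keywords)


def positive_predictive_value(flags, confirmed):
    """PPV = confirmed flagged records / all flagged records."""
    flagged = [c for f, c in zip(flags, confirmed) if f]
    if not flagged:
        return None  # PPV is undefined when the query flags nothing
    return sum(flagged) / len(flagged)


# Invented example notes with invented manual-review labels.
notes = [
    "CT angiography confirms pulmonary embolism.",   # true event
    "No evidence of pulmonary embolism.",            # negation
    "History of pulmonary embolism in 2015.",        # past event
    "Patient discharged in good condition.",         # no keyword
]
confirmed = [True, False, False, False]

flags = [matches_query(note) for note in notes]
ppv = positive_predictive_value(flags, confirmed)
print(f"Flagged {sum(flags)} of {len(notes)} notes; PPV = {ppv:.2%}")
```

In this toy case the plain keyword match flags the negated and historical mentions as well as the true event, which is why the positive predictive value drops even though every flagged note contains the keyword.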
Conclusions: Our results demonstrate the feasibility of using an automated approach to detect multiple adverse events across several data sources. More sophisticated techniques, such as natural language processing, should be tested to determine whether text mining can serve as a routine method for monitoring adverse events in hospitals.
Copyright © 2020 Wolters Kluwer Health, Inc. All rights reserved.