Background/Objectives: Earlier detection of severe immune-related hematological adverse events (irHAEs) in cancer patients treated with a PD-1 or PD-L1 inhibitor is critical to improving treatment outcomes. This study aimed to develop a simple machine learning (ML) model for predicting irHAEs associated with PD-1/PD-L1 inhibitors. Methods: We used data in the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) format, derived from the electronic medical records of a tertiary (KHMC) and a secondary (KHNMC) hospital in South Korea. Severe irHAEs were defined as Grade 3-5 events according to the Common Terminology Criteria for Adverse Events (version 5.0). The predictive model was developed using the KHMC dataset and then cross-validated against an independent cohort (KHNMC). The full ML models were then simplified by selecting critical features based on their feature importance values (FIVs). Results: Overall, 397 and 255 patients were included in the primary (KHMC) and cross-validation (KHNMC) cohorts, respectively. Among the tested ML algorithms, random forest achieved the best discriminative performance (area under the receiver operating characteristic curve [AUROC] 0.88 for both cohorts). Parsimonious models, reduced to 50% of the FIVs of the full models, performed comparably to the full models (AUROC 0.83-0.86, p > 0.05). The KHMC and KHNMC parsimonious models shared common predictive features, including furosemide, oxygen gas, piperacillin/tazobactam, and acetylcysteine. Conclusions: Given their simplicity and adequate predictive performance, our simplified ML models might be readily implemented in clinical practice with broad applicability. Our model might enhance early diagnostic screening for irHAEs induced by PD-1/PD-L1 inhibitors, helping to minimize the risk of severe irHAEs and improve the effectiveness of cancer immunotherapy.
Keywords: PD-1 inhibitor; PD-L1 inhibitor; common data model (CDM); immune checkpoint inhibitor; immune-related hematological adverse events; machine learning; observational medical outcomes partnership (OMOP); parsimonious model; pharmacovigilance; real-world data (RWD); risk prediction.
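The feature-reduction step described in the Methods can be illustrated with a minimal sketch. It assumes that "reduced to 50% FIVs" means keeping the top-ranked features whose importance values together reach 50% of the total; the abstract does not spell out the exact rule, so this reading is an assumption. The function name `select_parsimonious_features`, the placeholder feature names, and the importance values are all hypothetical and are not taken from the study.

```python
def select_parsimonious_features(importances, threshold=0.5):
    """Return the top-ranked features whose importance values together
    reach at least `threshold` of the total importance.

    `importances` maps feature name -> feature importance value (FIV).
    Assumption: this cumulative-share rule is one plausible reading of
    "reduced to 50% FIVs", not the study's confirmed procedure.
    """
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    selected = []
    cumulative = 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value
        # Stop once the retained features cover the requested share.
        if cumulative / total >= threshold:
            break
    return selected


# Hypothetical importance values, for illustration only.
example_fivs = {
    "feature_a": 0.4,
    "feature_b": 0.3,
    "feature_c": 0.2,
    "feature_d": 0.1,
}
print(select_parsimonious_features(example_fivs, threshold=0.5))
```

In practice the importance values would come from the fitted full model (for example, a random forest's per-feature importances), and the reduced feature set would be used to refit and re-evaluate a smaller model.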