Point-Based Prediction Model for Bladder Cancer Risk in Diabetes: A Random Survival Forest-Guided Approach

J Clin Med. 2024 Dec 24;14(1):4. doi: 10.3390/jcm14010004.

Abstract

Background: Previous epidemiological studies have shown that diabetes is associated with an increased risk of several cancers, including bladder cancer. However, prediction models for bladder cancer among diabetes patients remain scarce. This study aims to develop a scoring system for bladder cancer risk prediction among diabetes patients who receive routine care in general outpatient clinics, using a machine learning-guided approach.

Methods: A territory-wide retrospective cohort study was conducted using electronic health records from Hong Kong. Patients who received diabetes care in public general outpatient clinics between 2010 and 2019 and had no history of malignancy were identified and followed up until December 2019. To develop the scoring system, a random survival forest was used to guide variable selection, and Cox regression was then applied to assign weights to the selected predictors.

Results: Of the 382,770 patients identified, 644 developed bladder cancer during follow-up (median: 6.2 years). The incidence rate was 0.29 per 1000 person-years. The final time-to-event scoring system included age, serum creatinine, sex, and smoking as predictors. Serum creatinine ≥94 µmol/L appeared to be associated with an increased risk of developing bladder cancer. The 2-year and 5-year AUCs on the test set were 0.88 (95% CI: 0.84-0.92) and 0.86 (95% CI: 0.80-0.92), respectively.

Conclusions: Renal dysfunction could be a potential predictor of bladder cancer among diabetes patients. The proposed scoring system could potentially be useful for providing individualized risk prediction among diabetes patients.
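As a rough illustration of the weight-assignment step described in the Methods, the sketch below shows one common way to turn Cox regression coefficients into integer points: divide each coefficient by the smallest one and round. The abstract does not report the fitted coefficients or the exact conversion rule, so every number here is a hypothetical placeholder, and the variable coding (age in decades above 50, binary indicators for the other predictors) and the function names `betas_to_points` and `total_score` are assumptions made for this example only.

```python
# HYPOTHETICAL Cox log-hazard ratios, for illustration only.
# These are NOT the coefficients estimated in the study.
HYPOTHETICAL_BETAS = {
    "age_decades_over_50": 0.60,  # assumed coding: decades of age above 50
    "creatinine_ge_94": 0.50,     # 1 if serum creatinine >= 94 umol/L, else 0
    "male": 0.90,                 # 1 if male, else 0
    "ever_smoker": 0.30,          # 1 if ever smoked, else 0
}


def betas_to_points(betas):
    """Scale each coefficient by the smallest absolute one and round to an integer."""
    base = min(abs(b) for b in betas.values())
    return {name: round(b / base) for name, b in betas.items()}


def total_score(points, patient):
    """Sum points times the patient's value for each predictor (missing counts as 0)."""
    return sum(points[name] * patient.get(name, 0) for name in points)


POINTS = betas_to_points(HYPOTHETICAL_BETAS)

# Hypothetical patient: 70-year-old male non-smoker with raised creatinine.
example_patient = {
    "age_decades_over_50": 2,
    "creatinine_ge_94": 1,
    "male": 1,
    "ever_smoker": 0,
}
example_score = total_score(POINTS, example_patient)
```

In a scheme like this, a higher total corresponds to a higher predicted hazard. Mapping a total to an absolute 2-year or 5-year risk would additionally require the baseline survival estimate from the fitted Cox model, which the abstract does not provide.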

Keywords: bladder cancer; diabetes; random forest; risk prediction; survival analysis.