[A pretest model of obstructive coronary artery disease based on machine learning: from the C-Strat study]

Zhonghua Nei Ke Za Zhi. 2022 Feb 1;61(2):185-192. doi: 10.3760/cma.j.cn112138-20210119-00049.
[Article in Chinese]

Abstract

Objective: To develop a pretest probability model of obstructive coronary artery disease with machine learning, based on multi-site Chinese population data. Methods: The Chinese regiStry in early deTection and Risk strAtificaTion of coronary plaques (C-Strat) study is a prospective multi-center cohort study that included consecutive patients with suspected obstructive coronary artery disease who were evaluated by coronary computed tomography angiography (CCTA) with ≥64 detector rows. Patient data were randomly split into a training set (70%) and a test set (30%). Coronary artery stenosis of more than 50% on CCTA was defined as the positive outcome. In the training set, a boosted ensemble algorithm (XGBoost) with 10-fold cross-validation and Bayesian optimization was used to establish a new prediction model, CARDIACS (pretest probability model from Chinese registry in eARly Detection and rIsk stratificAtion of Coronary plaques Study), and logistic regression was used to establish a second model, LOGISTIC. The test set was used to validate and compare CARDIACS, LOGISTIC, UDFM (updated Diamond-Forrester Model) and DFCASS (Diamond-Forrester and CASS). Results: The study population included 29 455 patients aged (57.0±9.7) years, 44.8% of whom were women; 19.1% (5 622/29 455) had obstructive coronary artery disease. For CARDIACS, age, reason for visit and body mass index (BMI) were the most important predictive variables. In the independent test set, the area under the curve (AUC) of CARDIACS was 0.72 (95%CI 0.70-0.73), significantly higher than that of LOGISTIC (AUC 0.69, 95%CI 0.68-0.71, P=0.015), UDFM (AUC 0.64, 95%CI 0.62-0.65, P<0.001) and DFCASS (AUC 0.66, 95%CI 0.64-0.67, P<0.001). Conclusion: The study developed a new pretest probability model, CARDIACS, in a Chinese population, and it outperformed the traditional models. CARDIACS is expected to assist clinical decision-making for patients with stable chest pain.
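As a rough illustration of the data handling described in the Methods, the sketch below shows a random 70/30 split of record indices and the outcome rule (stenosis of more than 50% on CCTA counts as positive). The function names, the seed and the use of floor rounding for the cut point are assumptions made for illustration; the abstract does not state how the split was implemented or the exact size of each set.

```python
import random


def split_train_test(n_records, train_percent=70, seed=0):
    """Randomly split record indices into training and test index lists.

    Hypothetical helper: the seed and the floor rounding of the cut point
    are illustrative choices, not details reported by the study.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = n_records * train_percent // 100  # integer arithmetic, rounds down
    return indices[:cut], indices[cut:]


def is_obstructive(max_stenosis_percent):
    """Positive outcome as defined in the abstract: stenosis of more than 50%."""
    return max_stenosis_percent > 50


# Cohort size taken from the abstract; the indices stand in for patient records.
train_idx, test_idx = split_train_test(29455)
print(len(train_idx), len(test_idx))

# Prevalence reported in the abstract: 5 622 of 29 455 patients.
print(round(100 * 5622 / 29455, 1))
```

The last line recomputes the reported prevalence from the counts given in the Results.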

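Objective: To develop a pretest probability model of obstructive coronary artery disease for the Chinese population using machine learning algorithms. Methods: The study included 29 455 patients from the Chinese regiStry in early deTection and Risk strAtificaTion of coronary plaques (C-Strat), a clinical registry study, who underwent coronary CT angiography (CCTA) for suspected coronary artery disease; demographic and clinical information was collected as predictor variables. The data were randomly split into a training set and a test set at a ratio of 7:3, and coronary artery stenosis greater than 50% diagnosed by CCTA was defined as the positive outcome. In the training set, the eXtreme Gradient Boosting (XGBoost) algorithm was applied, with 10-fold cross-validation and Bayesian optimization for parameter tuning, to obtain the machine learning model CARDIACS (pretest probability model from Chinese registry in eARly Detection and rIsk stratificAtion of Coronary plaques Study); logistic regression was used to obtain the model LOGISTIC. CARDIACS, LOGISTIC and the guideline-recommended models UDFM (Updated Diamond-Forrester Model) and DFCASS (Diamond-Forrester and CASS) were validated and compared in the test set. Results: The 29 455 patients were aged (57.0±9.7) years, 44.8% were women, and the prevalence of obstructive coronary artery disease was 19.1% (5 622/29 455). In the CARDIACS model, reason for visit, age and body mass index were the most important predictor variables. In the independent test set, the area under the curve (AUC) of CARDIACS was 0.72 (95%CI 0.70-0.73), better than that of LOGISTIC (AUC 0.69, 95%CI 0.68-0.71, P=0.015), UDFM (AUC 0.64, 95%CI 0.62-0.65, P<0.001) and DFCASS (AUC 0.66, 95%CI 0.64-0.67, P<0.001). Conclusion: CARDIACS, a new pretest probability model developed in a Chinese population, predicted obstructive coronary artery disease in the Chinese population markedly better than the traditional models and is expected to assist clinical decision-making for stable chest pain.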
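The models are compared by the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the code below computes AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are illustrative assumptions, not study data, and this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes (1 = obstructive disease present).
    scores: predicted probabilities or risk scores, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example (made-up values): three of the four positive/negative pairs
# are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # prints 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.64 to 0.72 should be read. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for a cohort of this size.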

MeSH terms

  • Aged
  • Bayes Theorem
  • Cohort Studies
  • Computed Tomography Angiography
  • Coronary Angiography
  • Coronary Artery Disease* / diagnostic imaging
  • Female
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Predictive Value of Tests
  • Prospective Studies
  • Risk Assessment