A low-cost machine learning-based cardiovascular/stroke risk assessment system: integration of conventional factors with image phenotypes

Cardiovasc Diagn Ther. 2019 Oct;9(5):420-430. doi: 10.21037/cdt.2019.09.03.

Abstract

Background: Most cardiovascular (CV)/stroke risk calculators that integrate carotid ultrasound image-based phenotypes (CUSIP) with conventional risk factors (CRF) have shown improved risk stratification compared with either method alone. However, such approaches have not yet leveraged the potential of machine learning (ML). Most intelligent ML strategies use follow-up data for the endpoints, which is costly and time-intensive. We introduce an integrated ML system that uses stenosis as an endpoint for training and determine whether such a system can lead to superior performance compared with the conventional ML system.

Methods: The ML-based algorithm consists of an offline and an online system. The offline system extracts 47 features, comprising 13 CRF and 34 CUSIP. Principal component analysis (PCA) was used to select the most significant features. A random forest (RF) classifier was then trained on these offline features against the event-equivalent gold standard (percentage stenosis) to generate training coefficients. The online system then transforms the PCA-based test features using the offline-trained coefficients to predict the risk labels of test subjects. Performance was measured as the area under the curve (AUC) under a 10-fold cross-validation paradigm. This system, called "AtheroRisk-Integrated", was compared against "AtheroRisk-Conventional", in which the feature set contained only the 13 CRF.
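
The pipeline described in the Methods can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the data are synthetic, the binary label is a hypothetical stand-in for a stenosis-derived risk class, and the helper name `cv_auc`, the 95% explained-variance setting for PCA, and the number of trees are all assumptions. PCA is used here as a projection onto principal components, which may differ from the feature selection step in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 395 scans, 13 CRF columns followed by 34 CUSIP columns.
n_scans, n_crf, n_cusip = 395, 13, 34
X = rng.normal(size=(n_scans, n_crf + n_cusip))

# Hypothetical binary risk label (assumption); the paper derives its
# labels from percentage stenosis, which is not modelled here.
y = (X[:, 0] + X[:, 20] + rng.normal(size=n_scans) > 0).astype(int)


def cv_auc(features, labels, seed=0):
    """Return the AUC of out-of-fold predictions from 10-fold cross-validation."""
    model = make_pipeline(
        StandardScaler(),
        # Keep enough components to explain 95% of the variance (assumed setting).
        PCA(n_components=0.95, svd_solver="full"),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    # Scaler, PCA, and forest are refit on each training fold ("offline"),
    # then applied unchanged to the held-out fold ("online").
    proba = cross_val_predict(
        model, features, labels, cv=folds, method="predict_proba"
    )[:, 1]
    return float(roc_auc_score(labels, proba))


auc_integrated = cv_auc(X, y)               # all 47 features
auc_conventional = cv_auc(X[:, :n_crf], y)  # 13 CRF columns only
print(f"integrated AUC:   {auc_integrated:.2f}")
print(f"conventional AUC: {auc_conventional:.2f}")
```

Wrapping the scaler, PCA, and classifier in one pipeline keeps the held-out fold out of every fitting step, which mirrors the offline/online split described above. The AUC values printed for this synthetic data say nothing about the study's results.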

Results: Left and right common carotid arteries of 202 Japanese patients (Toho University, Japan) were retrospectively examined to obtain 395 ultrasound scans. The AtheroRisk-Integrated system [AUC =0.80, P<0.0001, 95% confidence interval (CI): 0.77 to 0.84] showed an improvement of ~18% over the AtheroRisk-Conventional ML system (AUC =0.68, P<0.0001, 95% CI: 0.64 to 0.72).
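
The abstract does not state how the ~18% figure was computed. One reading consistent with the reported AUCs is a relative gain over the conventional system, which the short calculation below checks; the variable names are illustrative only.

```python
# Reported AUC values from the Results paragraph.
auc_integrated = 0.80
auc_conventional = 0.68

# Absolute gain in AUC, then the gain relative to the conventional baseline.
absolute_gain = auc_integrated - auc_conventional
relative_gain_pct = 100 * absolute_gain / auc_conventional

print(f"absolute gain: {absolute_gain:.2f}")       # 0.12
print(f"relative gain: {relative_gain_pct:.1f}%")  # 17.6%
```

Under this reading the relative gain is about 17.6%, which rounds to the ~18% quoted above.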

Conclusions: An ML-based integrated model that uses percentage stenosis as the event-equivalent gold standard is powerful and offers low-cost, high-performance CV/stroke risk assessment.

Keywords: 10-year risk; Atherosclerosis; cardiovascular disease (CVD); carotid intima-media thickness (cIMT); carotid stenosis; carotid ultrasound (CUS); conventional risk factors (CRF); machine learning (ML); stroke.