Comparison between statistical and machine learning methods to detect the hematological indices with the greatest influence on elevated serum levels of low-density lipoprotein cholesterol

Chem Phys Lipids. 2024 Nov:265:105446. doi: 10.1016/j.chemphyslip.2024.105446. Epub 2024 Oct 5.

Abstract

Introduction: Elevated levels of low-density lipoprotein cholesterol (LDL-C) are a significant risk factor for the development of cardiovascular diseases (CVDs). Furthermore, studies have revealed an association between indices of the complete blood count (CBC) and dyslipidemia. We aimed to investigate the relationship between CBC parameters and serum levels of LDL-C.

Method: In a prospective study involving 9704 participants aged 35-65 years, comprehensive screening was conducted to estimate LDL-C levels and CBC indicators. The association between these biomarkers and high LDL-C (LDL-C ≥ 130 mg/dL (3.25 mmol/L)) was investigated using several analytical methods, including Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Neural Network (NN), and Support Vector Machine (SVM).
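As a minimal sketch of the outcome definition described in the Method, the snippet below labels a participant as having high LDL-C using the 130 mg/dL cut-off stated in the abstract. The function names and the `ldl_c` record key are hypothetical, introduced only for illustration; the study's actual preprocessing is not described here.

```python
# Threshold taken from the abstract: high LDL-C is LDL-C >= 130 mg/dL.
HIGH_LDL_C_MG_DL = 130.0


def has_high_ldl_c(ldl_c_mg_dl: float) -> bool:
    """Return True when a serum LDL-C value (mg/dL) meets the high-LDL-C cut-off."""
    return ldl_c_mg_dl >= HIGH_LDL_C_MG_DL


def split_by_outcome(records):
    """Split participant records into (high, not_high) groups.

    Each record is assumed to be a dict with an 'ldl_c' key in mg/dL
    (a hypothetical layout, not the study's data format).
    """
    high = [r for r in records if has_high_ldl_c(r["ldl_c"])]
    not_high = [r for r in records if not has_high_ldl_c(r["ldl_c"])]
    return high, not_high
```

The two groups produced this way correspond to the binary outcome that each of the listed classifiers would be trained to predict.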

Result: Age, hemoglobin (HGB), hematocrit (HCT), platelet count (PLT), lymphocyte count (LYM), PLT-LYM ratio (PLR), PLT-High-Density Lipoprotein (HDL) ratio (PHR), HGB-LYM ratio (HLR), red blood cell count (RBC), Neutrophil-HDL ratio (NHR), and PLT-RBC ratio (PRR) all differed significantly between the two groups (p<0.05). Red cell distribution width (RDW) was a significant predictor of higher LDL levels in women. In men, the RDW-PLT ratio (RPR) and PHR were the most important indicators of elevated LDL levels.
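The composite indices named above are simple quotients of routine measurements. The sketch below computes them as the names in the abstract suggest. The `BloodPanel` class, its field names, and the units are assumptions made for illustration, and the abstract does not state how the authors scaled each quantity.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Hypothetical container for one participant's measurements (units assumed)."""
    hgb: float   # hemoglobin
    plt: float   # platelet count
    lym: float   # lymphocyte count
    neut: float  # neutrophil count
    rbc: float   # red blood cell count
    rdw: float   # red cell distribution width
    hdl: float   # high-density lipoprotein cholesterol


def derived_indices(p: BloodPanel) -> dict:
    """Compute the ratio indices named in the abstract as plain quotients."""
    return {
        "PLR": p.plt / p.lym,   # PLT-LYM ratio
        "PHR": p.plt / p.hdl,   # PLT-HDL ratio
        "HLR": p.hgb / p.lym,   # HGB-LYM ratio
        "NHR": p.neut / p.hdl,  # Neutrophil-HDL ratio
        "PRR": p.plt / p.rbc,   # PLT-RBC ratio
        "RPR": p.rdw / p.plt,   # RDW-PLT ratio
    }
```

Because every index shares a numerator or denominator with another, the indices are strongly interdependent, which is worth keeping in mind when comparing variable-importance rankings across models.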

Conclusion: Female sex was associated with 52.9 % higher odds of elevated LDL-C, while age and HCT were associated with 4.1 % and 5.5 % higher odds, respectively. RPR and PHR were the most influential variables for both genders. Elevated RPR and PHR were negatively correlated with increased LDL levels in men, and RDW was a statistically significant factor in women. Moreover, RDW was a significant factor in women for high levels of HDL-C. A higher proportion of females than males had elevated LDL-C (16 % compared to 14 %), with significant differences across variables such as age, HGB, HCT, PLT, RLR, PHR, RBC, LYM, NHR, and RPR, and in key factors such as RDW and SII.
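Percentage changes in odds such as those reported above are conventionally read from logistic regression coefficients: a coefficient beta corresponds to an odds ratio of exp(beta), and the percentage change in odds is (exp(beta) - 1) × 100. The sketch below shows that conversion in both directions. It assumes the reported percentages derive from logistic regression odds ratios, which the abstract does not state explicitly, and the function names are hypothetical.

```python
import math


def percent_change_in_odds(beta: float) -> float:
    """Convert a logistic regression coefficient to a percentage change in odds."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_from_percent(percent: float) -> float:
    """Inverse conversion: percentage change in odds back to a coefficient."""
    return math.log(1.0 + percent / 100.0)


def odds_ratio_from_percent(percent: float) -> float:
    """Express a percentage change in odds as an odds ratio."""
    return 1.0 + percent / 100.0
```

Under this reading, a 52.9 % increase corresponds to an odds ratio of about 1.529. For a continuous predictor such as age, the percentage applies per one-unit increase under the usual logistic regression interpretation, so effects over several units multiply rather than add.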

Keywords: Cardiovascular disease (CVD); Decision tree; Low-density lipoprotein (LDL); Neural network; Random forest; Support vector machine.

Publication types

  • Comparative Study

MeSH terms

  • Adult
  • Aged
  • Blood Cell Count
  • Cholesterol, LDL* / blood
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Prospective Studies

Substances

  • Cholesterol, LDL