Identification of crucial genes for predicting the risk of atherosclerosis with system lupus erythematosus based on comprehensive bioinformatics analysis and machine learning

Comput Biol Med. 2023 Jan:152:106388. doi: 10.1016/j.compbiomed.2022.106388. Epub 2022 Nov 30.

Abstract

Background: Systemic lupus erythematosus (SLE) has become a major public health problem over the years, and atherosclerosis (AS) is one of the main complications of SLE associated with serious cardiovascular consequences in this patient population. The present study aimed to identify potential biomarkers for SLE patients with AS.

Methods: Five microarray datasets (GSE50772, GSE81622, GSE100927, GSE28829, GSE37356) were downloaded from the NCBI Gene Expression Omnibus database. The Limma package was used to identify differentially expressed genes (DEGs) in AS. Weighted gene coexpression network analysis (WGCNA) was used to identify significant module genes associated with SLE. Functional enrichment analysis, protein-protein interaction (PPI) network construction, and machine learning algorithms (least absolute shrinkage and selection operator (LASSO), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and random forest) were applied to identify hub genes. Subsequently, we generated a nomogram and receiver operating characteristic (ROC) curves for predicting the risk of AS in SLE patients. Finally, immune cell infiltration was analyzed, and Consensus Cluster Analysis was conducted based on Single Sample Gene Set Enrichment Analysis (ssGSEA) scores.
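As an illustration of how the outputs of several feature-selection algorithms can be reduced to one candidate list, the sketch below keeps only the genes chosen by every selector. It rests on two assumptions that the abstract does not confirm: that the hub genes were taken as the intersection of the three algorithms' selections, and that each algorithm returns a plain list of gene symbols. The function name `consensus_genes` and the gene labels are hypothetical placeholders, not data from the study.

```python
from functools import reduce


def consensus_genes(selections):
    """Return the genes chosen by every selector, sorted alphabetically.

    `selections` maps a selector name to the list of genes it retained.
    Taking the intersection is an assumption made for this sketch.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    return sorted(reduce(set.intersection, gene_sets))


# Placeholder gene labels only; these are NOT results from the study.
selections = {
    "lasso": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "svm_rfe": ["GENE_B", "GENE_C", "GENE_D", "GENE_E"],
    "random_forest": ["GENE_A", "GENE_B", "GENE_D", "GENE_F"],
}

hub_candidates = consensus_genes(selections)
print(hub_candidates)  # ['GENE_B', 'GENE_D']
```

Requiring agreement across all selectors is a conservative choice: it shrinks the list but reduces the chance that a gene is retained because of the quirks of a single algorithm.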

Results: Five hub genes (SPI1, MMP9, C1QA, CX3CR1, and MNDA) were identified and used to establish a nomogram that yielded a high predictive performance (area under the curve 0.900-0.981). Dysregulated immune cell infiltration was found in AS and correlated positively with the five hub genes. Consensus clustering showed that the optimal number of subtypes was 3. Compared to subtypes A and B, subtype C presented higher expression of the five hub genes, higher immune cell infiltration levels, and higher immune checkpoint expression.
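For readers unfamiliar with the area under the curve (AUC) metric reported above, the following minimal sketch computes it directly from its rank-based definition: the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the labels and scores are toy values invented for illustration; they are not taken from the study, where scores would come from the fitted nomogram model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = case, 0 = control; scores are hypothetical risk predictions.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported range of 0.900-0.981.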

Conclusion: Our study systematically identified five candidate hub genes (SPI1, MMP9, C1QA, CX3CR1, MNDA) and established a nomogram that could predict the risk of AS with SLE using various bioinformatic analyses and machine learning algorithms. Our findings provide a foothold for future studies on potential crucial genes for AS in SLE patients. Additionally, the dysregulated immune cell proportions and immune checkpoint expression in AS with SLE were identified.

Keywords: Atherosclerosis; Bioinformatics analysis; Immune infiltration; Machine learning; Potential crucial biomarker; Systemic lupus erythematosus.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Atherosclerosis* / genetics
  • Computational Biology
  • Humans
  • Lupus Erythematosus, Systemic* / genetics
  • Machine Learning
  • Matrix Metalloproteinase 9
  • Risk Factors

Substances

  • Matrix Metalloproteinase 9