Integrative multi-omics approach using random forest and artificial neural network models for early diagnosis and immune infiltration characterization in ischemic stroke

Front Neurol. 2024 Dec 4:15:1475582. doi: 10.3389/fneur.2024.1475582. eCollection 2024.

Abstract

Background: Ischemic stroke (IS) is a significant global health issue, causing high rates of morbidity, mortality, and disability. Because conventional diagnostic methods for IS have several shortcomings, new diagnostic models are needed to improve on existing approaches.

Methods: We used gene expression data from the Gene Expression Omnibus (GEO) datasets GSE16561 and GSE22255 to identify differentially expressed genes (DEGs) associated with IS. DEG analysis was performed with the limma package, followed by GO and KEGG enrichment analyses. Protein-protein interaction (PPI) networks were constructed from the DEGs using the STRING database, and Random Forest models were used to screen key DEGs. An artificial neural network model was then developed for IS classification. The GSE58294 dataset was used to evaluate the scoring model on healthy controls and ischemic stroke samples, with performance assessed by AUC analysis. CIBERSORT analysis was conducted to estimate the immune landscape and to explore the correlation between gene expression and immune cell infiltration.
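The final step above, relating gene expression to immune cell infiltration, can be illustrated with a rank correlation. The sketch below is only an illustration: the abstract does not state which correlation statistic was used, so Spearman's rho is an assumption, the function names are hypothetical, and the expression values and cell fractions are invented rather than taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: expression of one gene and the estimated fraction of one
# immune cell type across five samples (not data from the study).
expression = [5.1, 6.3, 7.0, 8.2, 9.4]
cell_fraction = [0.02, 0.05, 0.04, 0.09, 0.12]
rho = spearman(expression, cell_fraction)
```

A rank-based statistic is a common choice for this kind of comparison because estimated cell fractions are bounded and often skewed, but other measures would be equally valid here.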

Results: A total of 26 significant DEGs associated with IS were identified. Metascape analysis revealed enriched biological processes and pathways related to IS. Ten key DEGs (ARG1, DUSP1, F13A1, NFIL3, CCR7, ADM, PTGS2, ID3, FAIM3, HLA-DQB1) were selected using Random Forest and artificial neural network models. The area under the ROC curve (AUC) for the IS classification model was close to 1, indicating high classification accuracy. Additionally, analysis of the immune landscape demonstrated elevated immune-related networks in IS patients compared to healthy controls.
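To make the AUC figure concrete: the AUC equals the probability that a randomly chosen IS sample receives a higher model score than a randomly chosen control, so a value near 1 means the scores separate the two groups almost perfectly. The following is a minimal sketch of that pairwise definition. The function name is hypothetical, the labels and scores are invented, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a case (e.g. IS sample), 0 for a control.
    scores: model output, higher meaning more likely to be a case.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented scores: one case (0.40) scores below one control (0.45),
# so 15 of the 16 case-control pairs are ranked correctly.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.40, 0.45, 0.30, 0.20, 0.10]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for cohorts of this size. For large datasets a rank-based formulation gives the same value more efficiently.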

Conclusion: The study uncovers the involvement of specific genes and immune cells in the pathogenesis of IS, suggesting their importance in understanding and potentially targeting the disease.

Keywords: artificial neural network; diagnosis model; differentially expressed genes; ischemic stroke; random forest.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was funded by the Guangdong Province Traditional Chinese Medicine Administration Research Project (No. 20241335) and the 2024 Huizhou Hospital of Traditional Chinese Medicine Intramural Innovation Fund Project (No. 2023CXJJ006).