Background: Long COVID, a heterogeneous condition characterized by a range of physical and neuropsychiatric presentations, develops in a proportion of individuals infected with COVID-19.
Methods: Transcriptomic data sets containing gene expression profiles of COVID-19 patients, long COVID patients, and healthy controls were retrieved from the GEO database. Differentially expressed genes (DEGs) for COVID-19 and long COVID were identified with R packages, and module detection was performed in parallel with the Modular Pharmacology Platform (http://112.86.129.72:48081/). The DEGs and differentially expressed module-genes (DEMGs) of long COVID and COVID-19 were intersected, followed by Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Set Enrichment Analysis (GSEA).
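The intersection step described in the Methods can be illustrated with a minimal Python sketch. The function name `shared_genes` and the gene lists are hypothetical placeholders, not data from the study, and the original analysis relied on R packages and the Modular Pharmacology Platform rather than on this code.

```python
def shared_genes(*gene_sets):
    """Return the sorted genes that appear in every supplied gene set."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)  # keep only genes seen in all sets so far
    return sorted(common)


# Hypothetical placeholder lists, NOT results from the study.
covid_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
long_covid_genes = {"GENE_B", "GENE_C", "GENE_E"}

candidates = shared_genes(covid_genes, long_covid_genes)
print(candidates)
```

The same helper accepts any number of sets, so DEG and DEMG lists from both conditions could be passed together to obtain the genes common to all of them.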
Results: For COVID-19 and long COVID, respectively, 11 and 62 differentially expressed modules, 1837 and 179 DEGs, and 103 and 508 DEMGs were identified, notably enriched in immune-related signaling pathways. Immune cell infiltration in long COVID and COVID-19 was assessed and compared using the CIBERSORT, ssGSEA, and xCell algorithms. Hub genes were then screened using the SVM-RFE, random forest (RF), and XGBoost algorithms together with logistic regression analysis. Of the 67 candidate genes processed with the machine learning algorithms and logistic regression, a subgroup consisting of CEP55, CDCA2, MELK, and DEPDC1B was finally identified as potential biomarkers for predicting the risk of progression to long COVID after COVID-19 infection. The predictive performance of the potential biomarkers was quantified by an area under the ROC curve (AUC) of 0.8762542, indicating that the combination of potential biomarkers provided the highest performance.
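An ROC-based performance value such as the one reported above is typically the area under the ROC curve (AUC). A minimal sketch of how such a value can be computed follows, assuming each sample has a binary outcome label and a predicted score (for example, a probability from a logistic regression model). The function name `roc_auc` and the labels and scores are made-up illustration values, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs in which the
    positive sample has the higher score (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = progressed, 0 = did not) and predicted scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported value.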
Conclusions: In summary, we identified a subgroup of potential biomarkers for predicting the risk of progression to long COVID after COVID-19 infection, which may partly elucidate the molecular mechanisms associated with long COVID.
Keywords: COVID‐19; immune cell infiltration; long COVID; machine learning algorithms; modular pharmacology platform.
© 2025 The Author(s). Immunity, Inflammation and Disease published by John Wiley & Sons Ltd.