Classification of COVID-19 Patients into Clinically Relevant Subsets by a Novel Machine Learning Pipeline Using Transcriptomic Features

Int J Mol Sci. 2023 Mar 3;24(5):4905. doi: 10.3390/ijms24054905.

Abstract

The persistent impact of the COVID-19 pandemic and heterogeneity in disease manifestations point to a need for innovative approaches to identify drivers of immune pathology and predict whether infected patients will present with mild/moderate or severe disease. We have developed a novel iterative machine learning pipeline that utilizes gene enrichment profiles from blood transcriptome data to stratify COVID-19 patients based on disease severity and differentiate severe COVID-19 cases from other patients with acute hypoxic respiratory failure. The pattern of gene module enrichment in COVID-19 patients overall reflected broad cellular expansion and metabolic dysfunction, whereas increased neutrophils, activated B cells, T-cell lymphopenia, and proinflammatory cytokine production were specific to patients with severe COVID-19. Using this pipeline, we also identified small blood gene signatures indicative of COVID-19 diagnosis and severity that could be used as biomarker panels in the clinical setting.
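
The abstract does not specify how enrichment profiles are computed or which classifiers the pipeline uses, so the following is only an illustrative sketch of the general idea: summarize each gene module as a per-sample score, then classify samples from those scores. It assumes a mean z-score as a simple stand-in for an enrichment method and a nearest-centroid rule as a stand-in for the classifier. The gene names, module names, expression values, and labels are all hypothetical toy data, not results from the study.

```python
from statistics import mean, pstdev

def module_scores(expr, modules):
    """Score each module per sample as the mean z-score of its genes.

    expr: dict mapping gene -> list of expression values (one per sample).
    modules: dict mapping module name -> list of gene names.
    This is an assumed simplification, not the enrichment method of the paper.
    """
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in modules.items():
        present = [g for g in genes if g in z]
        if present:
            scores[name] = [mean(z[g][i] for g in present)
                            for i in range(n_samples)]
    return scores

def sample_vectors(scores):
    """Turn module scores into one feature vector per sample."""
    names = sorted(scores)
    n_samples = len(scores[names[0]])
    return [[scores[m][i] for m in names] for i in range(n_samples)]

def fit_centroids(vectors, labels):
    """Average the feature vectors of each class (nearest-centroid model)."""
    centroids = {}
    for label in set(labels):
        rows = [v for v, lab in zip(vectors, labels) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, vector):
    """Assign the class whose centroid is closest in squared distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], vector))

# Hypothetical toy data: four samples, four genes, two modules.
expr = {
    "G1": [1.0, 1.2, 5.0, 5.2],
    "G2": [0.9, 1.1, 4.8, 5.1],
    "G3": [6.0, 5.8, 2.0, 2.1],
    "G4": [5.9, 6.1, 1.9, 2.2],
}
modules = {"module_up": ["G1", "G2"], "module_down": ["G3", "G4"]}
labels = ["mild", "mild", "severe", "severe"]

scores = module_scores(expr, modules)
vectors = sample_vectors(scores)
centroids = fit_centroids(vectors, labels)
predictions = [predict(centroids, v) for v in vectors]
```

Working on module-level scores rather than individual genes keeps the feature space small, which is one plausible reason a pipeline of this kind could arrive at compact signatures. A real analysis would evaluate on held-out patients instead of the training samples used here.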

Keywords: COVID-19; bioinformatics; classification; machine learning; severity; transcriptomics.

MeSH terms

  • COVID-19 Testing
  • COVID-19*
  • Humans
  • Machine Learning
  • Pandemics
  • SARS-CoV-2
  • Transcriptome

Grants and funding

This research was funded by the RILITE Foundation.