WEDGE: imputation of gene expression values from single-cell RNA-seq datasets using biased matrix decomposition

Brief Bioinform. 2021 Sep 2;22(5):bbab085. doi: 10.1093/bib/bbab085.

Abstract

The low capture rate of expressed RNAs in single-cell sequencing is one of the major obstacles to downstream functional genomics analyses. A number of imputation methods have recently emerged for single-cell transcriptome data; however, recovering missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), which imputes gene expression matrices using a biased low-rank matrix decomposition method. WEDGE successfully recovered expression matrices, reproduced the cell-wise and gene-wise correlations and improved the clustering of cells, and it performed particularly well on sparse datasets. Overall, this study presents an effective approach for imputing sparse expression matrix data, and our WEDGE algorithm should help researchers explore the biological meaning embedded in their single-cell RNA sequencing datasets. The source code of WEDGE has been released at https://github.com/QuKunLab/WEDGE.
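The abstract does not give the objective function, so the following is only a minimal sketch of what a weighted ("biased") low-rank decomposition for imputation can look like, not the released WEDGE implementation. It assumes a non-negative factorization A ≈ WH in which zero entries of the expression matrix contribute to the squared error with a reduced weight. The function name `weighted_lowrank_impute`, the parameter `zero_weight` and its default value, the multiplicative update rule, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def weighted_lowrank_impute(A, rank=5, zero_weight=0.15, n_iter=200, seed=0, eps=1e-10):
    """Illustrative weighted non-negative low-rank factorization (hypothetical helper).

    Minimizes sum_ij M_ij * (A_ij - (W @ H)_ij) ** 2 with W, H >= 0, where
    M_ij = 1 for nonzero entries and M_ij = zero_weight for zero entries.
    The weighting scheme and default values are assumptions, not taken from the paper.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_cells = A.shape

    # Zero entries are trusted less than observed nonzero counts.
    M = np.where(A > 0, 1.0, zero_weight)
    MA = M * A

    # Strictly positive initialization keeps the multiplicative updates well defined.
    W = rng.random((n_genes, rank)) + 0.1
    H = rng.random((rank, n_cells)) + 0.1

    losses = []
    for _ in range(n_iter):
        # Standard multiplicative updates for weighted NMF (squared-error loss).
        W *= (MA @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MA) / (W.T @ (M * (W @ H)) + eps)
        losses.append(float(np.sum(M * (A - W @ H) ** 2)))

    # The low-rank product serves as the imputed (dense) expression matrix.
    return W @ H, losses


# Synthetic example: a rank-4 matrix with roughly half of its entries dropped to zero.
rng = np.random.default_rng(1)
truth = rng.random((30, 4)) @ rng.random((4, 20))
observed = truth * (rng.random(truth.shape) > 0.5)
imputed, losses = weighted_lowrank_impute(observed, rank=4)
```

In this sketch, the reduced weight on zeros means the factorization is driven mostly by the observed nonzero values, so the product `W @ H` can assign non-zero estimates to positions that were zero in the input. For the actual algorithm and its parameters, refer to the source code at the repository named in the abstract.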

Keywords: denoising; imputation; matrix decomposition; single-cell RNA-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • COVID-19 / blood
  • COVID-19 / genetics
  • COVID-19 / virology
  • Cluster Analysis
  • Computational Biology / methods*
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Genomics / methods
  • Humans
  • Leukocytes, Mononuclear / classification
  • Leukocytes, Mononuclear / metabolism
  • RNA-Seq / methods*
  • Reproducibility of Results
  • SARS-CoV-2 / physiology
  • Severity of Illness Index
  • Single-Cell Analysis / methods*