A data-driven dimensionality-reduction algorithm for the exploration of patterns in biomedical data

Nat Biomed Eng. 2021 Jun;5(6):624-635. doi: 10.1038/s41551-020-00635-3. Epub 2020 Nov 2.

Abstract

Dimensionality reduction is widely used in the visualization, compression, exploration and classification of data, yet a generally applicable solution remains unavailable. Here, we report an accurate and broadly applicable data-driven algorithm for dimensionality reduction. The algorithm, which we named 'feature-augmented embedding machine' (FEM), first learns the structure of the data and the inherent characteristics of the data components (such as central tendency and dispersion), then denoises the data, increases the separation of the components, and finally projects the data onto a lower-dimensional space. We show that the technique is effective at revealing the underlying dominant trends in datasets from protein expression, single-cell RNA sequencing, computed tomography, electroencephalography and wearable physiological sensors.
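
The abstract does not say how FEM carries out each step, so the following is only an illustrative sketch of the general sequence it names (learn structure, denoise, increase separation, project). It is not the authors' algorithm. As stand-ins, it assumes k-means for finding data components, a simple shrinkage toward component centres for denoising, a rescaling of centres away from the global mean for separation, and PCA via SVD for the projection. All names and parameters (`kmeans`, `sketch_reduce`, `shrink`, `spread`) are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for 'learning the structure')."""
    rng = np.random.default_rng(seed)
    # Initialise centres from k distinct data points (fancy indexing copies).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centre as is
                centers[j] = X[labels == j].mean(axis=0)
    # Final assignment against the final centres.
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centers


def sketch_reduce(X, k=3, shrink=0.5, spread=2.0, n_dims=2, seed=0):
    """Illustrative learn -> denoise -> separate -> project pipeline (not FEM)."""
    X = np.asarray(X, dtype=float)

    # 1. Learn components: central tendency (centres) and dispersion.
    labels, centers = kmeans(X, k, seed=seed)
    dispersion = np.array([
        X[labels == j].std(axis=0).mean() if np.any(labels == j) else 0.0
        for j in range(k)
    ])

    # 2. Denoise (assumption): pull each point toward its component centre.
    own_center = centers[labels]
    X_denoised = own_center + shrink * (X - own_center)

    # 3. Increase separation (assumption): push each component away from
    #    the global mean by scaling its centre offset by `spread`.
    global_mean = X.mean(axis=0)
    X_separated = X_denoised + (spread - 1.0) * (own_center - global_mean)

    # 4. Project onto the top n_dims principal directions (PCA via SVD).
    Xc = X_separated - X_separated.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_dims].T, labels, dispersion
```

Calling `sketch_reduce(X, k=3, n_dims=2)` on an array of shape `(n_samples, n_features)` returns the 2-D embedding, the component labels and the per-component dispersion. The values of `shrink` and `spread` are arbitrary defaults chosen for illustration, not values taken from the paper.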

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Biomedical Research / statistics & numerical data*
  • Datasets as Topic*
  • Electroencephalography / statistics & numerical data
  • Humans
  • Multifactor Dimensionality Reduction / statistics & numerical data*
  • Protein Biosynthesis
  • Sequence Analysis, RNA / statistics & numerical data
  • Single-Cell Analysis / statistics & numerical data
  • Tomography, X-Ray Computed / statistics & numerical data