Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unannotated pathology slides

Nat Commun. 2024 Jun 11;15(1):4596. doi: 10.1038/s41467-024-48666-7.

Abstract

Cancer diagnosis and management depend on pathologists extracting complex information from microscopy images, a process that requires time-consuming expert interpretation and is prone to human bias. Supervised deep learning approaches have proven powerful but are inherently limited by the cost and quality of the annotations used for training. We therefore present Histomorphological Phenotype Learning, a self-supervised methodology that requires no labels and operates by automatically discovering discriminatory features in image tiles. Tiles are grouped into morphologically similar clusters, which constitute an atlas of histomorphological phenotypes (HP-Atlas) and reveal trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features that can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. In lung cancer, we show that the clusters align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. These properties are maintained in a multi-cancer study.
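
The tile-clustering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the encoder or the clustering algorithm, so everything below is an assumption made for illustration: the tile embeddings are hand-written 2-D toy vectors standing in for self-supervised features, plain k-means with a deterministic farthest-point initialisation stands in for whatever clustering the method actually uses, and the helper names (`kmeans`, `slide_composition`) are hypothetical. The per-slide composition vector is likewise only one plausible way to summarise cluster membership at the slide level.

```python
import math

def farthest_point_init(points, k):
    # Deterministic seeding: start at the first point, then repeatedly
    # add the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def assign(points, centroids):
    # Index of the nearest centroid for every point.
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def kmeans(points, k, max_iters=50):
    # Plain k-means (a stand-in, not the paper's clustering method).
    centroids = farthest_point_init(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iters):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def slide_composition(slide_ids, labels, k):
    # Fraction of each slide's tiles that falls into each cluster.
    counts = {}
    for slide, lab in zip(slide_ids, labels):
        counts.setdefault(slide, [0] * k)[lab] += 1
    return {s: [c / sum(v) for c in v] for s, v in counts.items()}

# Toy data: (slide id, 2-D "embedding") per tile. Values are invented.
tiles = [
    ("slide_A", (0.0, 0.1)), ("slide_A", (0.1, 0.0)),
    ("slide_A", (0.2, 0.1)), ("slide_A", (5.0, 5.1)),
    ("slide_B", (5.1, 5.0)), ("slide_B", (4.9, 5.2)),
    ("slide_B", (5.2, 4.9)), ("slide_B", (0.1, 0.2)),
]
slide_ids = [s for s, _ in tiles]
embeddings = [e for _, e in tiles]

labels = kmeans(embeddings, k=2)
composition = slide_composition(slide_ids, labels, k=2)
print(composition)
# {'slide_A': [0.75, 0.25], 'slide_B': [0.25, 0.75]}
```

In this toy setting each slide becomes a short vector of cluster proportions, which is the kind of representation that could then be compared against slide-level variables such as tumor type or survival.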

MeSH terms

  • Deep Learning
  • Humans
  • Lung Neoplasms* / genetics
  • Lung Neoplasms* / pathology
  • Neoplasms / genetics
  • Neoplasms / pathology
  • Phenotype*
  • Supervised Machine Learning*
  • Transcriptome