Computerized identification of lymph node metastasis of breast cancer (BCLNM) from whole-slide pathological images (WSIs) can substantially benefit treatment decisions and prognosis analysis. Beyond the general challenges of computational pathology, such as extremely high resolution and costly fine-grained annotation, two particular difficulties of this task lie in (1) modeling the significant inter-tumoral heterogeneity in BCLNM pathological images, and (2) identifying micro-metastases, i.e., metastasized tumors with tiny foci. To this end, this paper presents a novel weakly supervised method, termed Prototypical Multiple Instance Learning (PMIL), that learns to predict BCLNM from WSIs using only slide-level class labels. PMIL introduces the well-established vocabulary-based multiple instance learning (MIL) paradigm into computational pathology; the paradigm is characterized by the use of prototypes to model pathological data and construct WSI features. PMIL consists mainly of two newly designed modules: the prototype discovery module, which acquires prototypes from training data by unsupervised clustering, and the prototype-based slide embedding module, which builds WSI features by matching a slide's constituent patches against the prototypes. Relative to existing MIL methods for WSI classification, PMIL has two substantial merits: (1) it models the inter-tumoral heterogeneity in BCLNM pathological images more explicitly and interpretably, and (2) it identifies micro-metastases more effectively. PMIL is evaluated on two datasets: the public Camelyon16 dataset and Zbraln, a dataset created by the authors. PMIL achieves an AUC of 88.2% on Camelyon16 and 98.4% on Zbraln (at 40x magnification), consistently outperforming the compared methods. A comprehensive analysis is also carried out to further demonstrate the effectiveness and merits of the proposed method.
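The two-module pipeline described above (prototype discovery by unsupervised clustering, then slide embedding by matching patches against prototypes) can be illustrated with a minimal sketch of a generic vocabulary-based MIL pipeline. This is not the PMIL implementation: the abstract does not specify the clustering algorithm, the patch feature extractor, or the matching function, so plain k-means and a normalized nearest-prototype histogram are assumed here, and the function names `discover_prototypes` and `embed_slide` are hypothetical.

```python
import numpy as np


def discover_prototypes(patch_features, num_prototypes, num_iters=20, seed=0):
    """Assumed stand-in for prototype discovery: plain k-means over patch
    features pooled from all training slides. Returns a (K, D) array."""
    rng = np.random.default_rng(seed)
    patch_features = np.asarray(patch_features, dtype=float)
    init = rng.choice(len(patch_features), size=num_prototypes, replace=False)
    prototypes = patch_features[init].copy()
    for _ in range(num_iters):
        # Squared Euclidean distance from every patch to every prototype: (N, K).
        dists = ((patch_features[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for k in range(num_prototypes):
            members = patch_features[assign == k]
            if len(members) > 0:  # keep the old prototype if its cluster is empty
                prototypes[k] = members.mean(axis=0)
    return prototypes


def embed_slide(slide_patches, prototypes):
    """Assumed stand-in for slide embedding: match each patch to its nearest
    prototype and return the normalized histogram of matches, shape (K,)."""
    slide_patches = np.asarray(slide_patches, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    dists = ((slide_patches[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(prototypes)).astype(float)
    return counts / counts.sum()


# Usage on synthetic data: 500 training patches with 8-dimensional features.
rng = np.random.default_rng(0)
train_patches = rng.normal(size=(500, 8))
prototypes = discover_prototypes(train_patches, num_prototypes=4)
slide_embedding = embed_slide(rng.normal(size=(120, 8)), prototypes)
```

The resulting fixed-length `slide_embedding` could then be passed to any standard classifier trained on slide-level labels. The histogram is only one possible matching scheme and is used here purely for illustration.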
Keywords: Breast cancer; Computational pathology; Lymph node metastasis; Prototypical multiple instance learning; Whole-slide images.
Copyright © 2023. Published by Elsevier B.V.