Machine learning for the identification of clinically significant prostate cancer on MRI: a meta-analysis

Eur Radiol. 2020 Dec;30(12):6877-6887. doi: 10.1007/s00330-020-07027-w. Epub 2020 Jun 30.

Abstract

Objectives: The aim of this study was to systematically review the literature and perform a meta-analysis of machine learning (ML) diagnostic accuracy studies focused on clinically significant prostate cancer (csPCa) identification on MRI.

Methods: Multiple medical databases were systematically searched for studies on ML applications in csPCa identification up to July 31, 2019. Two reviewers screened all papers independently for eligibility. The area under the receiver operating characteristic curve (AUC) was pooled to quantify predictive accuracy. A random-effects model estimated overall effect size while statistical heterogeneity was assessed with the I² value. A funnel plot was used to investigate publication bias. Subgroup analyses were performed based on reference standard (biopsy or radical prostatectomy) and ML type (deep and non-deep).
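
The pooling step described in the Methods can be illustrated with a minimal sketch. The abstract does not state which random-effects estimator was used, so the DerSimonian-Laird estimator below is an assumption, as are the function name `pool_random_effects` and the study-level AUCs and standard errors in the usage example, which are hypothetical and not taken from the included studies.

```python
import math


def pool_random_effects(estimates, std_errors, z=1.959964):
    """Pool study-level estimates (e.g. AUCs) with a DerSimonian-Laird
    random-effects model (assumed estimator, not confirmed by the source)."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) step, needed to compute Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - z * se, pooled + z * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Hypothetical study-level AUCs and standard errors, for illustration only.
aucs = [0.80, 0.90, 0.86, 0.92]
ses = [0.05, 0.04, 0.03, 0.06]
result = pool_random_effects(aucs, ses)
print(result["pooled"], result["ci"], result["i2"])
```

Pooling on the raw AUC scale is the simplest choice; because AUC is bounded between 0 and 1, some analyses pool on a transformed scale (for example the logit) and back-transform, which this sketch does not do.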

Results: After the final revision, 12 studies were included in the analysis. Statistical heterogeneity was high in both the overall and the subgroup analyses. The overall pooled AUC for ML in csPCa identification was 0.86 (95% confidence interval [95%CI] = 0.81-0.91). The biopsy subgroup (n = 9) had a pooled AUC of 0.85 (95%CI = 0.79-0.91), while the radical prostatectomy subgroup (n = 3) had a pooled AUC of 0.88 (95%CI = 0.76-0.99). Deep learning pipelines (n = 4) had a pooled AUC of 0.78 (95%CI = 0.69-0.86), while the remaining 8 studies had a pooled AUC of 0.90 (95%CI = 0.85-0.94).
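
As a rough aid to reading the subgroup figures, the sketch below back-calculates a standard error from a reported 95% confidence interval and forms a z statistic for the difference between two independent pooled estimates. This is not an analysis reported in the abstract. It assumes symmetric normal-approximation intervals on the AUC scale, independent subgroups, and ignores rounding in the published values; the helper names `se_from_ci` and `z_for_difference` are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def se_from_ci(lower, upper, z=Z_95):
    """Approximate standard error implied by a symmetric confidence interval."""
    return (upper - lower) / (2.0 * z)


def z_for_difference(est_a, ci_a, est_b, ci_b):
    """z statistic for the difference of two independent estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Pooled AUCs and 95%CIs as reported for the non-deep and deep learning subgroups.
z = z_for_difference(0.90, (0.85, 0.94), 0.78, (0.69, 0.86))
print(round(z, 2))
```

Because the reported intervals are rounded to two decimals and heterogeneity was high, the resulting statistic should be read as an approximation only.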

Conclusions: ML pipelines using prostate MRI to identify csPCa showed good accuracy and should be further investigated, possibly with better standardisation in design and reporting of results.

Key points

  • Overall pooled AUC was 0.86 (95%CI = 0.81-0.91).
  • In the reference standard subgroup analysis, algorithm accuracy was similar, with pooled AUCs of 0.85 (95%CI = 0.79-0.91) for studies employing biopsy and 0.88 (95%CI = 0.76-0.99) for studies employing radical prostatectomy.
  • Deep learning pipelines performed worse (AUC = 0.78, 95%CI = 0.69-0.86) than other approaches (AUC = 0.90, 95%CI = 0.85-0.94).

Keywords: Machine learning; Magnetic resonance imaging; Meta-analysis; Prostatic neoplasms.

Publication types

  • Meta-Analysis

MeSH terms

  • Algorithms
  • Area Under Curve
  • Biopsy
  • Diagnosis, Computer-Assisted / methods*
  • Humans
  • Machine Learning*
  • Magnetic Resonance Imaging*
  • Male
  • Prevalence
  • Prostatectomy
  • Prostatic Neoplasms / diagnostic imaging*
  • Prostatic Neoplasms / pathology
  • Prostatic Neoplasms / surgery*
  • ROC Curve
  • Reference Standards