Improving Automatic Polyp Detection Using CNN by Exploiting Temporal Dependency in Colonoscopy Video

IEEE J Biomed Health Inform. 2020 Jan;24(1):180-193. doi: 10.1109/JBHI.2019.2907434. Epub 2019 Apr 1.

Abstract

Automatic polyp detection has been shown to be difficult due to various polyp-like structures in the colon and high interclass variations in polyp size, color, shape, and texture. An efficient method should have not only a high correct detection rate (high sensitivity) but also a low false detection rate (high precision and specificity). State-of-the-art detection methods include convolutional neural networks (CNNs). However, CNNs have been shown to be vulnerable to small perturbations and noise; they sometimes miss the same polyp appearing in neighboring frames and produce a high number of false positives. We aim to tackle this problem and improve the overall performance of CNN-based object detectors for polyp detection in colonoscopy videos. Our method consists of two stages: a region of interest (RoI) proposal stage using CNN-based object detector networks, and a false positive (FP) reduction unit. The FP reduction unit exploits the temporal dependencies among image frames in a video by integrating the bidirectional temporal information obtained from RoIs in a set of consecutive frames. This information is used to make the final decision. The experimental results show that the bidirectional temporal information is helpful in estimating polyp positions and accurately predicting FPs. This provides an overall performance improvement in terms of sensitivity, precision, and specificity compared to the conventional false positive learning method, and thus achieves state-of-the-art results on the CVC-ClinicVideoDB video data set.
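
The idea behind the FP reduction stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the bidirectional temporal information is integrated, so the sketch assumes a simple voting rule in which a RoI is kept only if overlapping RoIs appear in enough neighboring frames, both earlier and later. The names `iou` and `temporal_filter`, and the parameters `window`, `min_support`, and `iou_thresh`, are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# A RoI as an axis-aligned box: (x1, y1, x2, y2). Hypothetical representation.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def temporal_filter(
    frames: List[List[Box]],
    window: int = 2,          # assumed: frames inspected on each side
    min_support: int = 2,     # assumed: neighboring frames that must agree
    iou_thresh: float = 0.5,  # assumed: overlap needed to count as agreement
) -> List[List[Box]]:
    """Drop RoIs that lack support in neighboring frames (past and future)."""
    kept: List[List[Box]] = []
    for t, boxes in enumerate(frames):
        lo, hi = max(0, t - window), min(len(frames), t + window + 1)
        neighbours = [frames[s] for s in range(lo, hi) if s != t]
        survivors = []
        for box in boxes:
            # Count neighboring frames holding at least one overlapping RoI.
            support = sum(
                any(iou(box, other) >= iou_thresh for other in frame_boxes)
                for frame_boxes in neighbours
            )
            if support >= min_support:
                survivors.append(box)
        kept.append(survivors)
    return kept
```

This sketch only removes isolated detections. Per the abstract, the method also uses the temporal information to estimate polyp positions, which this simplified rule does not attempt.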

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Colonic Polyps / diagnostic imaging*
  • Colonoscopy / methods*
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Neural Networks, Computer*
  • Video Recording / methods