Background and objectives: Chest X-ray (CXR) images are commonly used to diagnose respiratory and cardiovascular diseases. However, manual interpretation is often subjective, time-consuming, and error-prone, which leads to inconsistent detection accuracy and poor generalization. In this paper, we present deep learning-based object detection methods for automatically identifying and annotating abnormal regions in CXR images.
Methods: We developed and tested our models using CXR images from E-Da Hospital annotated with disease labels and location bounding boxes. Because normal images outnumber diseased ones in clinical settings, we constructed several training datasets and training schemes to assess how the proportion of background (normal) images affects model performance. To address the limited number of examples for certain diseases, we also investigated few-shot object detection techniques. We compared convolutional neural networks (CNNs) and Transformer-based models to determine the most effective architecture for medical image analysis.
Results: The proportion of background images in the training data strongly influenced inference performance. Moreover, schemes incorporating binary classification consistently improved performance, and CNN-based models outperformed Transformer-based models across all scenarios.
Conclusions: We have developed a more efficient and reliable system for the automated detection of disease labels and location bounding boxes in CXR images.
Keywords: chest X-rays; deep learning; few-shot object detection; object detection.
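Illustrative sketch: the Methods describe training datasets that differ in their proportion of background images. The Python fragment below shows one way such a dataset could be assembled. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `mix_background`, its parameters, and the choice to keep every diseased image while subsampling background images are all hypothetical.

```python
import random


def mix_background(diseased, background, background_fraction, seed=0):
    """Build a training list with a target share of background images.

    Hypothetical helper: keeps every diseased image and subsamples
    background images (without replacement) so that they make up
    roughly `background_fraction` of the returned list.
    """
    if not 0.0 <= background_fraction < 1.0:
        raise ValueError("background_fraction must be in [0, 1)")

    rng = random.Random(seed)

    # Solve n_bg / (n_bg + n_dis) = f for n_bg.
    n_dis = len(diseased)
    n_bg = round(background_fraction * n_dis / (1.0 - background_fraction))

    # Cannot draw more background images than are available.
    n_bg = min(n_bg, len(background))

    mixed = list(diseased) + rng.sample(list(background), n_bg)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Placeholder file names, not real data.
    diseased = [f"diseased_{i}.png" for i in range(100)]
    background = [f"normal_{i}.png" for i in range(1000)]

    for fraction in (0.0, 0.2, 0.5):
        train = mix_background(diseased, background, fraction)
        n_bg = sum(name.startswith("normal_") for name in train)
        print(fraction, len(train), n_bg)
```

Sweeping `background_fraction` over several values and training one detector per resulting list is one simple way to set up the kind of comparison the Methods describe. The fixed `seed` keeps each mixture reproducible.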