Pneumothorax detection and segmentation from chest X-ray radiographs using a patch-based fully convolutional encoder-decoder network

Jakov Ivan S Dumbrique; Reynan B Hernandez; Juan Miguel L Cruz; Ryan M Pagdanganan; Prospero C Naval Jr

doi:10.3389/fradi.2024.1424065

Front Radiol. 2024 Dec 11:4:1424065. doi: 10.3389/fradi.2024.1424065. eCollection 2024.

Authors

Jakov Ivan S Dumbrique¹ ², Reynan B Hernandez³ ⁴, Juan Miguel L Cruz⁴, Ryan M Pagdanganan⁴, Prospero C Naval Jr¹

Affiliations

¹ Computer Vision and Machine Intelligence Group, Department of Computer Science, University of the Philippines-Diliman, Quezon City, Philippines.
² Department of Mathematics, Ateneo de Manila University, Quezon City, Philippines.
³ Ateneo School of Medicine and Public Health, Pasig, Philippines.
⁴ Department of Radiology, The Medical City, Pasig, Philippines.

Abstract

Pneumothorax, a life-threatening condition characterized by air accumulation in the pleural cavity, requires early and accurate detection for optimal patient outcomes. Chest X-ray radiographs are a common diagnostic tool due to their speed and affordability. However, detecting pneumothorax can be challenging for radiologists because the sole visual indicator is often a thin displaced pleural line. This research explores deep learning techniques to automate and improve the detection and segmentation of pneumothorax from chest X-ray radiographs. We propose a novel architecture that combines the advantages of fully convolutional neural networks (FCNNs) and Vision Transformers (ViTs) while using only convolutional modules to avoid the quadratic complexity of ViT's self-attention mechanism. This architecture utilizes a patch-based encoder-decoder structure with skip connections to effectively combine high-level and low-level features. Compared to prior research and baseline FCNNs, our model demonstrates significantly higher accuracy in detection and segmentation while maintaining computational efficiency. This is evident on two datasets: (1) the SIIM-ACR Pneumothorax Segmentation dataset and (2) a novel dataset we curated from The Medical City, a private hospital in the Philippines. Ablation studies further reveal that using a mixed Tversky and Focal loss function significantly improves performance compared to using solely the Tversky loss. Our findings suggest our model has the potential to improve diagnostic accuracy and efficiency in pneumothorax detection, potentially aiding radiologists in clinical settings.
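The mixed Tversky and Focal loss mentioned in the ablation result can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption: the function names, the Tversky weights `alpha` and `beta`, the focal exponent `gamma`, and the mixing `weight` are illustrative values, not the authors' settings. The sketch uses the standard definitions: the Tversky index TP / (TP + alpha·FP + beta·FN), and the binary focal loss -(1 - p_t)^gamma · log(p_t).

```python
import numpy as np


def tversky_loss(probs, target, alpha=0.3, beta=0.7, eps=1e-7):
    """Return 1 - Tversky index for a predicted probability map and a binary mask.

    alpha weights false positives and beta weights false negatives
    (both values are illustrative assumptions, not taken from the paper).
    """
    probs = np.asarray(probs, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = np.sum(probs * target)            # soft true positives
    fp = np.sum(probs * (1.0 - target))    # soft false positives
    fn = np.sum((1.0 - probs) * target)    # soft false negatives
    return float(1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps))


def focal_loss(probs, target, gamma=2.0, eps=1e-7):
    """Return the mean binary focal loss; gamma=2.0 is an assumed default."""
    probs = np.clip(np.asarray(probs, dtype=np.float64).ravel(), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64).ravel()
    # p_t is the probability the model assigns to the true class of each pixel.
    p_t = np.where(target == 1.0, probs, 1.0 - probs)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


def mixed_loss(probs, target, weight=0.5):
    """Weighted sum of the two losses; the 0.5 weight is a hypothetical choice."""
    return weight * tversky_loss(probs, target) + (1.0 - weight) * focal_loss(probs, target)


# Toy example: a 4x4 mask with a small positive region, as in a thin pneumothorax.
mask = np.zeros((4, 4))
mask[1, 1:3] = 1.0
good_pred = np.where(mask == 1.0, 0.9, 0.1)   # mostly correct prediction
poor_pred = np.full((4, 4), 0.1)              # misses the positive region

print("good prediction loss:", mixed_loss(good_pred, mask))
print("poor prediction loss:", mixed_loss(poor_pred, mask))
```

In this sketch the Tversky term scores region overlap and, with `beta > alpha`, penalizes missed positive pixels more heavily, while the focal term is a per-pixel loss that down-weights easy, confidently classified pixels. Whether the authors combine the two terms in exactly this way is not stated in the abstract.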

Keywords: Vision Transformer; automatic image segmentation; chest X-rays; convolutional neural network; deep learning; diagnostic radiology; lung pathology detection; pneumothorax.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The authors would like to acknowledge the University Research Council (URC) of the Ateneo de Manila University for the funding of this work (grant URC-11-2020) and the DOST-SEI ERDT Program for the FRDG of this publication.