REDT: a specialized transformer model for respiratory phase and adventitious sound detection

Physiol Meas. 2025 Jan 27. doi: 10.1088/1361-6579/adaf08. Online ahead of print.

Abstract

Background and objective: In contrast to respiratory sound classification, respiratory phase and adventitious sound event detection provides more detailed and accurate respiratory information, which is clinically important for respiratory disorders. However, current respiratory sound event detection models mainly use convolutional neural networks to generate frame-level predictions. A significant drawback of frame-based models is that they optimize frame-level predictions rather than event-level ones; moreover, they require post-processing and cannot be trained fully end-to-end. To address these limitations, this paper proposes an event-based Transformer method, the Respiratory Events Detection Transformer (REDT), for the multi-class respiratory sound event detection task, achieving efficient recognition and localization of respiratory phase and adventitious sound events.

Approach: First, REDT employs a Transformer for time-frequency analysis of the respiratory sound signal to extract essential features. Second, it converts these features into timestamp representations and performs sound event detection by predicting the location and category of each timestamp.
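The timestamp representation can be sketched as follows. This is a hedged illustration, not the paper's implementation: the normalized (center, width) encoding, the class list, and the "no-event" label are assumptions borrowed from DETR-style detectors, used here only to show how timestamp predictions decode directly into labelled intervals without frame-level post-processing.

```python
# Hypothetical decoding of timestamp predictions: each prediction is a
# normalized (center, width) pair plus a class id, mapping directly to an
# (onset, offset) interval in seconds. Class set is illustrative only.

CLASSES = ["inspiration", "expiration", "CAS", "DAS", "no-event"]

def decode_timestamps(predictions, record_len_s):
    """Turn (center, width, class_id) triples into labelled event intervals."""
    events = []
    for center, width, class_id in predictions:
        if CLASSES[class_id] == "no-event":  # unmatched queries predict "no event"
            continue
        onset = max(0.0, (center - width / 2) * record_len_s)
        offset = min(record_len_s, (center + width / 2) * record_len_s)
        events.append((round(onset, 2), round(offset, 2), CLASSES[class_id]))
    return events

preds = [(0.25, 0.2, 0), (0.6, 0.2, 1), (0.5, 0.1, 4)]
print(decode_timestamps(preds, record_len_s=15.0))
# → [(2.25, 5.25, 'inspiration'), (7.5, 10.5, 'expiration')]
```

Because the model's outputs are already events, the detection loss can be computed at the event level and the whole pipeline trained end-to-end.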

Main results: Our method is validated on the public dataset HF_Lung_V1. The experimental results show that our F1 scores for inspiration, expiration, Continuous Adventitious Sound (CAS), and Discontinuous Adventitious Sound (DAS) are 90.5%, 77.3%, 78.9%, and 59.4%, respectively.

Significance: These results demonstrate the method's strong performance in respiratory sound event detection.

Keywords: Event-based detection; Fine-tuning pretrained model; Hierarchical Transformer; Respiratory sound event detection.