Evaluation of Vertical Level Differences Between Left and Right Vocal Folds Using Artificial Intelligence System in Excised Canine Larynx

J Voice. 2024 Jan 11:S0892-1997(23)00385-5. doi: 10.1016/j.jvoice.2023.11.025. Online ahead of print.

Abstract

Objectives: This study aimed to establish an artificial intelligence (AI) system to classify vertical level differences between vocal folds during vocalization and to evaluate the accuracy of the classification.

Methods: We designed models with different depths between the right and left vocal folds using an excised canine larynx. Video files for the data set were obtained using a high-speed camera system and a color complementary metal oxide semiconductor camera with a global shutter. The data were divided into training, validation, and testing sets. We used 20,000 images for building the model and 8,000 images for testing. To perform deep learning multiclass classification and to estimate the vertical level difference, we introduced a DenseNet121-ConvLSTM model.

Results: The model was trained several times using different numbers of epochs. The best results were obtained at 100 epochs with a batch size of 16. The proposed DenseNet121-ConvLSTM model achieved classification accuracies of 99.5% and 88.0% for training and testing, respectively. After verification using an external data set, the overall accuracy, precision, recall, and F1-score were 90.8%, 91.6%, 90.9%, and 91.2%, respectively.
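
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a multiclass classifier from a confusion matrix. It assumes macro-averaging over classes, which the abstract does not specify. The function name `macro_metrics`, the three-class layout, and the counts in `example_confusion` are hypothetical and are not taken from the study.

```python
from statistics import mean
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": mean(precisions),
        "recall": mean(recalls),
        "f1": mean(f1s),
    }


# Hypothetical counts for three illustrative classes (not data from the study).
example_confusion = [
    [90, 5, 5],
    [4, 92, 4],
    [6, 3, 91],
]
result = macro_metrics(example_confusion)
print(result)
```

With these made-up counts, every row and column sums to 100, so all four metrics come out at about 0.91. In general the four values differ, as they do in the figures reported above.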

Conclusions: The newly developed AI system may be an easy and accurate method for classifying superior and inferior vertical level differences between vocal folds. Thus, this AI system may help in assessing vertical level differences in patients with unilateral vocal fold paralysis.

Keywords: AI system; Canine larynx; Deep learning; Vertical level difference; Vocal fold.