Voice Analysis in Dogs with Deep Learning: Development of a Fully Automatic Voice Analysis System for Bioacoustics Studies

Sensors (Basel). 2024 Dec 13;24(24):7978. doi: 10.3390/s24247978.

Abstract

Extracting behavioral information from animal sounds has long been a focus of research in bioacoustics, as sound-derived data are crucial for understanding animal behavior and environmental interactions. Traditional methods, which involve manual review of extensive recordings, pose significant challenges. This study proposes an automated system for detecting and classifying animal vocalizations, enhancing efficiency in behavior analysis. The system uses a preprocessing step to segment relevant sound regions from audio recordings, followed by feature extraction using Short-Time Fourier Transform (STFT), Mel-frequency cepstral coefficients (MFCCs), and linear-frequency cepstral coefficients (LFCCs). These features are input into convolutional neural network (CNN) classifiers to evaluate performance. Experimental results demonstrate the effectiveness of different CNN models and feature extraction methods, with AlexNet, DenseNet, EfficientNet, ResNet50, and ResNet152 being evaluated. The system achieves high accuracy in classifying vocal behaviors, such as barking and howling in dogs, providing a robust tool for behavioral analysis. The study highlights the importance of automated systems in bioacoustics research and suggests future improvements using deep learning-based methods for enhanced classification performance.

Keywords: CNN; audio processing; automatic behavior analysis; bioacoustics.

MeSH terms

  • Acoustics
  • Animals
  • Deep Learning*
  • Dogs
  • Fourier Analysis
  • Neural Networks, Computer*
  • Vocalization, Animal* / physiology
  • Voice / physiology

Grants and funding

This research received no external funding.