Analysis of high-molecular-weight proteins using MALDI-TOF MS and machine learning for the differentiation of clinically relevant Clostridioides difficile ribotypes

Eur J Clin Microbiol Infect Dis. 2024 Dec 17. doi: 10.1007/s10096-024-05023-2. Online ahead of print.

Abstract

Purpose: Clostridioides difficile is the main cause of antibiotic-associated diarrhea, and some ribotypes (RT), such as RT027, RT181 or RT078, are considered high-risk clones. A fast and reliable ribotyping method is needed to guide clinical management. This study analyzes high-molecular-weight proteins for C. difficile ribotyping with MALDI-TOF MS.

Methods: Sixty-nine isolates representative of the most common ribotypes in Europe were analyzed in the 17,000-65,000 m/z region and classified into four categories (RT027, RT181, RT078 and 'Other RTs'). Five supervised machine learning algorithms were tested for this purpose: K-Nearest Neighbors, Support Vector Machine, Partial Least Squares-Discriminant Analysis, Random Forest (RF) and Light Gradient Boosting Machine (Light-GBM).
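As a rough illustration of how a classifier of this kind can be cross-validated, the following minimal sketch runs a pure-Python K-Nearest Neighbors classifier under stratified k-fold cross-validation. It is not the study's pipeline: the spectra are synthetic, and the number of folds, the number of neighbors, the feature layout and all function names are assumptions chosen for illustration only.

```python
import random
from collections import Counter, defaultdict


def make_synthetic_spectra(classes, per_class=10, noise=0.05, seed=1):
    """Build made-up 'spectra': one intensity bin per class, plus Gaussian noise.

    Purely illustrative data; it does not come from the study.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for c, name in enumerate(classes):
        for _ in range(per_class):
            features.append(
                [(1.0 if b == c else 0.0) + rng.gauss(0, noise)
                 for b in range(len(classes))]
            )
            labels.append(name)
    return features, labels


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin
    return folds


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the closest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(y for _, y in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(features, labels, k=5, n_neighbors=3):
    """Return the fraction of held-out samples classified correctly."""
    correct = 0
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train_x = [features[i] for i in range(len(labels)) if i not in held]
        train_y = [labels[i] for i in range(len(labels)) if i not in held]
        for i in held_out:
            if knn_predict(train_x, train_y, features[i], n_neighbors) == labels[i]:
                correct += 1
    return correct / len(labels)


if __name__ == "__main__":
    categories = ["RT027", "RT181", "RT078", "Other RTs"]
    X, y = make_synthetic_spectra(categories)
    print(f"Cross-validation agreement: {cross_validate(X, y):.2f}")
```

Stratifying the folds keeps every category represented in each training set, which matters when some categories contain only a few isolates.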

Results: All algorithms yielded cross-validation results above 70%. RF and Light-GBM performed best, with 88% agreement, and the area under the ROC curve for both algorithms was > 0.9. RT078 was classified with 100% accuracy, whereas isolates from the RT181 category could not be differentiated from RT027.
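To make the relationship between overall agreement and per-category performance concrete, the sketch below computes both from a confusion matrix. The counts are hypothetical and are not the study's data; they only mimic the qualitative pattern of one category being classified perfectly while another is absorbed by a neighboring category. The function names are likewise illustrative.

```python
def per_class_recall(confusion):
    """Fraction of each true category that was predicted correctly.

    confusion[true_label][predicted_label] holds a count.
    """
    recall = {}
    for true_label, row in confusion.items():
        total = sum(row.values())
        if total:
            recall[true_label] = row.get(true_label, 0) / total
    return recall


def overall_agreement(confusion):
    """Fraction of all samples whose prediction matches the true category."""
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    return correct / total


# Hypothetical counts for illustration only (NOT taken from the study).
example_confusion = {
    "RT027": {"RT027": 9, "RT181": 1},
    "RT181": {"RT027": 4},
    "RT078": {"RT078": 10},
    "Other RTs": {"Other RTs": 18, "RT027": 2},
}

if __name__ == "__main__":
    for label, value in per_class_recall(example_confusion).items():
        print(f"{label}: {value:.2f}")
    print(f"Overall agreement: {overall_agreement(example_confusion):.2f}")
```

A high overall agreement can therefore coexist with a category that is never recovered, which is why per-category figures are worth reporting alongside the aggregate.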

Conclusions: This study shows that relevant C. difficile ribotypes can be discriminated rapidly using MALDI-TOF MS. Compared with current reference methods, this methodology reduces time, cost and labor.

Keywords: Clostridioides difficile; MALDI-TOF MS; Machine learning; Ribotyping.