Objectives: To develop a supervised machine learning model that predicts the optimal cerebrospinal fluid (CSF) dilution for determining virus-specific antibody indices, thereby reducing the need for repeat testing.
Methods: A CatBoost model was trained, optimized, and tested on a dataset with five input variables: albumin quotient, immunoglobulin G (IgG) in CSF, IgG quotient (QIgG), intrathecal synthesis (ITS), and limes quotient (LIM IgG). Albumin and IgG concentrations in CSF and serum were measured by immunonephelometry on an Atellica NEPH 630 (Siemens Healthineers, Erlangen, Germany), and ITS and LIM IgG were calculated according to Reiber. Concentrations of IgG antibodies to measles, rubella, varicella zoster, and herpes simplex 1/2 viruses were analysed in CSF and serum by ELISA (Euroimmun, Lübeck, Germany). The optimal CSF dilution was defined for each virus and used as the classification target; the standard operating procedure starts at a 2× dilution of CSF.
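As an illustration of how the quotient-based inputs relate to one another, the following minimal sketch derives the albumin quotient, QIgG, the limes quotient, and an intrathecal fraction from paired CSF and serum concentrations. It assumes the commonly cited form of Reiber's hyperbolic function for IgG and assumes that ITS is expressed as the intrathecal fraction in percent; the constants, the function name `reiber_igg`, and the example concentrations are illustrative assumptions and are not taken from the study.

```python
import math


def reiber_igg(alb_csf, alb_serum, igg_csf, igg_serum):
    """Derive quotient-based features from paired CSF/serum concentrations.

    Each CSF/serum pair must use the same unit (e.g. mg/L) so that the
    quotients are dimensionless. Hypothetical helper, not the study's code.
    """
    q_alb = alb_csf / alb_serum   # albumin quotient
    q_igg = igg_csf / igg_serum   # IgG quotient (QIgG)

    # Upper limit of the reference range (limes quotient) for IgG.
    # Assumption: commonly cited Reiber constants for IgG.
    q_lim = 0.93 * math.sqrt(q_alb ** 2 + 6e-6) - 1.7e-3

    # Intrathecal fraction in percent; zero when QIgG does not exceed Q_lim.
    its_pct = max(0.0, (1.0 - q_lim / q_igg) * 100.0)

    return {"q_alb": q_alb, "q_igg": q_igg, "q_lim": q_lim, "its_pct": its_pct}


# Illustrative values only (mg/L), not patient data from the study.
with_synthesis = reiber_igg(alb_csf=250.0, alb_serum=40000.0,
                            igg_csf=80.0, igg_serum=10000.0)
without_synthesis = reiber_igg(alb_csf=250.0, alb_serum=40000.0,
                               igg_csf=20.0, igg_serum=10000.0)
print(with_synthesis)
print(without_synthesis)
```

In the first example QIgG lies above the limes quotient, so a positive intrathecal fraction is reported; in the second it lies below, so the fraction is clamped to zero.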
Results: The dataset comprised 571 samples with an imbalanced distribution of optimal CSF dilutions: 2× dilution, n=440; 3× dilution, n=109; 4× dilution, n=22. The optimized CatBoost model achieved an area under the curve (AUC) of 0.971 and a test accuracy of 0.900. The model misclassified 14 (9.9 %) samples in the test set but reduced the need for repeat testing by 42 % compared with the standard protocol. The model output depends mostly on the QIgG, ITS, and CSF IgG variables.
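To put the class imbalance in context, the short sketch below computes the share of each optimal dilution from the reported counts and the accuracy of a trivial rule that always predicts the 2× dilution. The baseline is computed on the full dataset rather than on the test split, whose class composition is not reported here. The inverse-frequency class weights are shown only as one common way to handle imbalance; whether any weighting was used in the study is an assumption not confirmed by the text.

```python
# Reported counts of the optimal CSF dilution in the full dataset.
counts = {"2x": 440, "3x": 109, "4x": 22}

total = sum(counts.values())

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of always predicting the most frequent class (2x dilution),
# evaluated on the full dataset as a rough reference point.
majority_baseline = max(counts.values()) / total

# Assumption: "balanced" inverse-frequency weights, n_samples / (n_classes * n_class).
class_weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(f"{label}: share={shares[label]:.3f}, weight={class_weights[label]:.2f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

On these counts the majority-class rule is correct for roughly 77 % of samples, which gives a rough reference point for the reported test accuracy of 0.900.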
Conclusions: The resulting algorithm accurately predicts the optimal CSF dilution and reduces the number of repeat tests.
Keywords: MRZ reaction; cerebrospinal fluid; machine learning; model optimization.
© 2023 Walter de Gruyter GmbH, Berlin/Boston.