Automated amyloid-PET image classification can support clinical assessment and increase diagnostic confidence. Three automated approaches were compared: global cut-points derived from Receiver Operating Characteristic (ROC) analysis, machine learning (ML) algorithms using regional SUVr values, and a deep learning (DL) network with 3D image input. The comparison covered various conditions: amount of training data, radiotracer, and cohort. The data comprised 276 [11C]PiB and 209 [18F]AV45 PET images from the ADNI database and our local cohort. Global mean and maximum SUVr cut-points were derived using ROC analysis. Sixty-eight ML models were built using regional SUVr values, and one DL network was trained, using classifications from two visual assessments: one following the manufacturer's recommendations (gray-scale) and one with visually guided reference region scaling (rainbow-scale). ML-based classification achieved accuracy similar to that of ROC-based classification but showed better convergence between training and unseen data, with a smaller training set. Naïve Bayes performed best overall among the 68 ML algorithms. Classification with maximum SUVr cut-points yielded higher accuracy than classification with mean SUVr cut-points, particularly for cohorts showing more focal uptake. The DL network classified definite cases accurately but performed poorly on equivocal cases. The rainbow-scale standardized image intensity scaling and improved inter-rater agreement. The gray-scale detected focal accumulation better and therefore classified more scans as amyloid-positive. All three approaches generally achieved higher accuracy when trained with rainbow-scale classifications. In summary, ML yielded accuracy similar to that of ROC analysis with better convergence between training and unseen data, and further work may lead to even more accurate ML methods.
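As an illustration of how a global SUVr cut-point can be derived from ROC analysis, the following is a minimal sketch only. It assumes the cut-point is chosen by maximizing Youden's J (sensitivity + specificity - 1); the abstract does not state which selection criterion was used. The function name `youden_cutpoint` and the SUVr values and labels are hypothetical and are not study data.

```python
from typing import Sequence, Tuple


def youden_cutpoint(
    suvr: Sequence[float], positive: Sequence[bool]
) -> Tuple[float, float]:
    """Return (cut_point, J) maximizing Youden's J = sensitivity + specificity - 1.

    A scan is called amyloid-positive when its SUVr is >= the cut-point.
    Assumption: Youden's J is the selection criterion (not stated in the source).
    """
    if len(suvr) != len(positive):
        raise ValueError("suvr and positive must have the same length")
    n_pos = sum(1 for p in positive if p)
    n_neg = len(positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative scan")

    best_cut, best_j = None, -1.0
    # Every observed SUVr value is a candidate threshold on the ROC curve.
    for cut in sorted(set(suvr)):
        tp = sum(1 for v, p in zip(suvr, positive) if p and v >= cut)
        tn = sum(1 for v, p in zip(suvr, positive) if not p and v < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical global SUVr values with visual reads (True = amyloid-positive).
suvr_values = [1.05, 1.10, 1.18, 1.30, 1.25, 1.48, 1.60, 1.95, 2.10]
visual_read = [False, False, False, False, True, True, True, True, True]

cut_point, j_stat = youden_cutpoint(suvr_values, visual_read)
print(f"cut-point = {cut_point:.2f}, Youden's J = {j_stat:.2f}")
```

The same function could be applied separately to global mean and global maximum SUVr values to obtain the two kinds of cut-point compared in the abstract.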
Keywords: Alzheimer’s disease; Deep Learning; Equivocal; Machine Learning; Positron emission tomography (PET); Visual interpretation.
© 2022. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.