Test-Retest Reproducibility of 18F-FDG PET/CT Uptake in Cancer Patients Within a Qualified and Calibrated Local Network

Brenda F Kurland; Lanell M Peterson; Andrew T Shields; Jean H Lee; Darrin W Byrd; Alena Novakova-Jiresova; Mark Muzi; Jennifer M Specht; David A Mankoff; Hannah M Linden; Paul E Kinahan

doi:10.2967/jnumed.118.209544

J Nucl Med. 2019 May;60(5):608-614. doi: 10.2967/jnumed.118.209544. Epub 2018 Oct 25.

Affiliations

¹ Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania. bfk10@pitt.edu
² Division of Medical Oncology, University of Washington/Seattle Cancer Care Alliance, Seattle, Washington.
³ Department of Radiology, University of Washington, Seattle, Washington.
⁴ Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania.

Abstract

Calibration and reproducibility of quantitative ¹⁸F-FDG PET measures are essential for adopting integral ¹⁸F-FDG PET/CT biomarkers and response measures in multicenter clinical trials. We implemented a multicenter qualification process using National Institute of Standards and Technology-traceable reference sources for scanners and dose calibrators, and similar patient and imaging protocols. We then assessed SUV in patient test-retest studies.

Methods: Five ¹⁸F-FDG PET/CT scanners from 4 institutions (2 in a National Cancer Institute-designated Comprehensive Cancer Center, 3 in a community-based network) were qualified for study use. Patients were scanned twice within 15 d, on the same scanner (n = 10); different but same model scanners within an institution (n = 2); or different model scanners at different institutions (n = 11). SUV_max was recorded for lesions, and SUV_mean for normal liver uptake. Linear mixed models with random intercept were fitted to evaluate test-retest differences in multiple lesions per patient and to estimate the concordance correlation coefficient. Bland-Altman plots and repeatability coefficients were also produced.

Results: In total, 162 lesions (82 bone, 80 soft tissue) were assessed in patients with breast cancer (n = 17) or other cancers (n = 6). Repeat scans within the same institution, using the same scanner or 2 scanners of the same model, had an average difference in SUV_max of 8% (95% confidence interval, 6%-10%). For test-retest on different scanners at different sites, the average difference in lesion SUV_max was 18% (95% confidence interval, 13%-24%). Normal liver uptake (SUV_mean) showed an average difference of 5% (95% confidence interval, 3%-10%) for the same scanner model or institution and 6% (95% confidence interval, 3%-11%) for different scanners from different institutions. Protocol adherence was good; the median difference in injection-to-acquisition time was 2 min (range, 0-11 min). Test-retest SUV_max variability was not explained by available information on protocol deviations or patient or lesion characteristics.

Conclusion: ¹⁸F-FDG PET/CT scanner qualification and calibration can yield highly reproducible test-retest tumor SUV measurements. Our data support use of different qualified scanners of the same model for serial studies. Test-retest differences from different scanner models were greater; more resolution-dependent harmonization of scanner protocols and reconstruction algorithms may be capable of reducing these differences to values closer to same-scanner results.
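
Illustrative note: the agreement statistics named in the Methods (percentage test-retest difference, Bland-Altman limits, repeatability coefficient, concordance correlation coefficient) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' analysis: it uses the simple sample form of Lin's concordance correlation coefficient and treats every lesion as independent, whereas the study fitted linear mixed models with a random intercept to handle multiple lesions per patient. All function names and the SUV_max values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def percent_differences(test, retest):
    """Absolute test-retest difference as a percentage of the pair mean."""
    return [100.0 * abs(a - b) / ((a + b) / 2.0) for a, b in zip(test, retest)]


def bland_altman_limits(test, retest):
    """Bland-Altman 95% limits of agreement: bias +/- 1.96 * SD of differences."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias - 1.96 * sd, bias + 1.96 * sd


def repeatability_coefficient(test, retest):
    """RC = 1.96 * sqrt(2) * wSD, with wSD^2 = sum(d^2) / (2n) for paired scans."""
    diffs = [b - a for a, b in zip(test, retest)]
    wsd = math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))
    return 1.96 * math.sqrt(2) * wsd


def lins_ccc(x, y):
    """Sample form of Lin's concordance correlation coefficient.

    Assumption: observations are independent; clustering of lesions
    within patients is ignored here (the study used mixed models instead).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


if __name__ == "__main__":
    # Hypothetical SUV_max values for five lesions (NOT from the study).
    scan1 = [3.1, 5.4, 7.9, 4.2, 10.3]
    scan2 = [3.3, 5.0, 8.4, 4.1, 11.0]
    print("mean % difference:", mean(percent_differences(scan1, scan2)))
    print("limits of agreement:", bland_altman_limits(scan1, scan2))
    print("repeatability coefficient:", repeatability_coefficient(scan1, scan2))
    print("concordance correlation:", lins_ccc(scan1, scan2))
```

SUV analyses of this kind are often run on log-transformed values so that differences are proportional rather than absolute; the sketch works on the raw scale for simplicity.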

Keywords: 18F-FDG PET/CT; SUV; quantitative imaging; reproducibility; test–retest.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Adult
Aged
Biological Transport
Calibration
Female
Fluorodeoxyglucose F18 / metabolism*
Humans
Liver / diagnostic imaging
Liver / metabolism
Male
Middle Aged
Neoplasms / diagnostic imaging*
Neoplasms / metabolism*
Positron Emission Tomography Computed Tomography*
Reproducibility of Results

Substances

Fluorodeoxyglucose F18