Purpose: To compare liver lesion volume measurement on multiple 3D software platforms using a liver phantom.
Methods: An anthropomorphic phantom constructed with ten liver lesions of varying size, attenuation, and shape, with known volume and long-axis measurement, was scanned (120 kVp, 80-440 smart mA, NI 12). DICOM data were uploaded to five commercially available 3D visualization systems, and manual tumor volumes were obtained by three independent readers. Accuracy and reproducibility of linear and volume measurements were compared. The two most promising systems were then compared with an additional prototype system by two readers using both manual and semi-automated measurement, with a similar comparison between linear and volume measures. Measurements were performed on 5- and 1.25-mm data sets. Inter- and intra-observer variability was also assessed.
Results: Overall mean % volume error on the five commercially available software systems (averaging all ten liver lesions among all three readers) was 8.0% ± 7.5%, 13.7% ± 11.2%, 14.2% ± 15.2%, 16.4% ± 14.8%, and 16.9% ± 13.8%, varying almost twofold across vendors. Moderate inter-observer variability was present. Volume measurement was slightly more accurate than linear measurement, but linear measurement was more reproducible across readers and systems. On the two "best" systems, the manual measurement method was more accurate than the automated method (p = 0.001). The prototype system demonstrated superior semi-automated assessment, with a mean % volume error of 5.3% ± 4.1% (vs. 17.8% ± 11.1% and 31.5% ± 19.7%, p < 0.001), with improved inter- and intra-observer variability.
Conclusions: Accuracy and reproducibility of volume assessment of liver lesions vary significantly by vendor, which has important implications for clinical use.
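
Illustrative note: the mean % volume error reported in the Results can be sketched as follows. This is a minimal example under stated assumptions, not the study's actual analysis: it assumes the error is the absolute difference between measured and known phantom volume, expressed as a percentage of the known volume, and summarized as mean ± sample standard deviation over all lesion-reader pairs. The function names, lesion labels, and volumes are hypothetical.

```python
from statistics import mean, stdev


def percent_volume_error(measured_ml, true_ml):
    """Absolute volume error as a percentage of the known volume.

    Assumption: the absolute (unsigned) error is used; the abstract does
    not state whether errors were signed or unsigned.
    """
    if true_ml <= 0:
        raise ValueError("true volume must be positive")
    return abs(measured_ml - true_ml) / true_ml * 100.0


def summarize_errors(measurements, true_volumes):
    """Return (mean, sample SD) of % volume error over all measurements.

    measurements: iterable of (lesion_id, reader_id, measured_ml) tuples.
    true_volumes: dict mapping lesion_id to the known volume in mL.
    """
    errors = [
        percent_volume_error(measured_ml, true_volumes[lesion_id])
        for lesion_id, _reader_id, measured_ml in measurements
    ]
    return mean(errors), stdev(errors)


# Hypothetical data: two lesions, two readers (not values from the study).
true_volumes = {"lesion_a": 10.0, "lesion_b": 4.0}
measurements = [
    ("lesion_a", "reader_1", 11.0),
    ("lesion_a", "reader_2", 9.5),
    ("lesion_b", "reader_1", 4.4),
    ("lesion_b", "reader_2", 3.8),
]

avg, sd = summarize_errors(measurements, true_volumes)
print(f"mean % volume error: {avg:.1f}% ± {sd:.1f}%")
```

Pooling all lesion-reader pairs before averaging mirrors the "averaging all ten liver lesions among all three readers" wording; a per-reader or per-lesion breakdown would require grouping the tuples first.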