Real-time Markerless Tracking of Lung Tumors based on 2-D Fluoroscopy Imaging using Convolutional LSTM

IEEE Trans Radiat Plasma Med Sci. 2022 Feb;6(2):189-199. doi: 10.1109/trpms.2021.3126318. Epub 2021 Nov 13.

Abstract

Purpose: To investigate the feasibility of tracking targets in 2D fluoroscopic images using a novel deep learning network.

Methods: Our model is designed to capture the consistent motion of tumors in fluoroscopic images using a neural network. Specifically, the model is trained with generative adversarial methods, and the network follows a coarse-to-fine architecture. Convolutional LSTM (Long Short-Term Memory) modules are introduced to account for the temporal correlation between successive frames of the fluoroscopic images. The model was trained and tested on a digital X-CAT phantom in two studies. Series of coherent 2D fluoroscopic images representing the full respiration cycle were fed into the model to predict the localized tumor regions. In the first study, which tested a wide range of scenarios, phantoms of different scales, tumor positions, tumor sizes, and respiration amplitudes were generated to evaluate the accuracy of the model comprehensively. In the second study, which tested a specific sample, phantoms were generated with fixed body and tumor sizes but different respiration amplitudes to investigate the effect of motion amplitude on tracking accuracy. Tracking accuracy was quantitatively evaluated using intersection over union (IOU), tumor area difference, and centroid of mass difference (COMD).
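The evaluation metrics named above can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the function name `tracking_metrics`, the pixel spacing `pixel_cm`, the mapping of image rows to the SI direction and columns to the LR direction, and the normalization of the area difference by the ground-truth area are all assumptions made for illustration. Both masks are assumed to be non-empty.

```python
import numpy as np

def tracking_metrics(pred, truth, pixel_cm=0.1):
    """Compare a predicted and a ground-truth binary tumor mask.

    Assumptions (hypothetical, not from the paper): rows run along SI,
    columns along LR, pixels are square with side `pixel_cm`, and both
    masks contain at least one tumor pixel.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()

    iou = inter / union
    dice = 2 * inter / (pred.sum() + truth.sum())

    # Relative area difference, normalized by the ground-truth area (assumed).
    area_diff_pct = 100 * abs(int(pred.sum()) - int(truth.sum())) / truth.sum()

    # Centroid = mean (row, column) index of the mask pixels.
    pred_row, pred_col = np.argwhere(pred).mean(axis=0)
    true_row, true_col = np.argwhere(truth).mean(axis=0)

    return {
        "iou": float(iou),
        "dice": float(dice),
        "area_diff_pct": float(area_diff_pct),
        "comd_si_cm": float(abs(pred_row - true_row) * pixel_cm),
        "comd_lr_cm": float(abs(pred_col - true_col) * pixel_cm),
    }

# Toy example: a 4x4 tumor shifted by one pixel along the row (SI) axis.
truth = np.zeros((10, 10), dtype=bool)
truth[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool)
pred[3:7, 2:6] = True

m = tracking_metrics(pred, truth)
print(m)
```

In this toy case the masks share 12 of 20 union pixels, so the IOU is 0.6 and the Dice coefficient is 0.75, while the areas match and the centroid shifts by one pixel along the assumed SI axis only.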

Results: In the first (comprehensive) study, the mean IOU and Dice coefficient were 0.93 ± 0.04 and 0.96 ± 0.02, respectively. The mean tumor area difference was 4.34% ± 4.04%. The COMD was 0.16 cm and 0.07 cm on average in the SI (superior-inferior) and LR (left-right) directions, respectively. In the second (amplitude) study, the mean IOU and Dice coefficient were 0.98 and 0.99, respectively. The mean tumor area difference was 0.17%. The COMD was 0.03 cm and 0.01 cm on average in the SI and LR directions, respectively. These results demonstrate the robustness of our model against breathing variations.
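As a consistency check on the reported overlap scores, the standard definitions of IOU and the Dice coefficient for a predicted region A and a ground-truth region B are related as shown below. The identity holds exactly for a single pair of masks; applying it to mean values, as in the second line, is only approximate.

```latex
\[
\mathrm{IOU} = \frac{|A \cap B|}{|A \cup B|}, \qquad
\mathrm{Dice} = \frac{2\,|A \cap B|}{|A| + |B|} = \frac{2\,\mathrm{IOU}}{1 + \mathrm{IOU}}
\]
\[
\frac{2 \times 0.93}{1 + 0.93} \approx 0.96, \qquad
\frac{2 \times 0.98}{1 + 0.98} \approx 0.99
\]
```

Both values agree with the mean Dice coefficients reported for the two studies.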

Conclusion: Our study showed the feasibility of using deep learning to track targets in X-ray fluoroscopic projection images without the aid of markers. The technique could be valuable for real-time target verification with fluoroscopic imaging, both before and during treatment, in lung SBRT.

Keywords: convolutional LSTM; fluoroscopy imaging; lung SBRT; neural network; target tracking.