External beam radiotherapy is a commonly used treatment option for patients with cancer in the thoracic and abdominal regions. However, respiratory motion constitutes a major limitation during the intervention: it may cause the actual anatomy to deviate from the target and trajectories defined during planning. We propose a novel framework for predicting in-plane organ motion. We introduce a recurrent encoder-decoder architecture that leverages feature representations at multiple scales. It simultaneously learns to map dense deformations between consecutive images of a given input sequence and to extrapolate these deformations through time. Several cascaded spatial transformers then use the predicted deformation fields to generate a future image sequence. We propose a composite loss function that minimizes the difference between ground-truth and predicted images while maintaining smooth deformations. Our model is trained end-to-end in an unsupervised manner, so it requires no information beyond the image data; moreover, no pre-processing steps such as segmentation or registration are needed. We report results on 85 different cases (healthy subjects and patients) belonging to multiple datasets across different imaging modalities. Experiments investigated the importance of the proposed multi-scale architecture design and the effect of increasing the number of predicted frames on the overall accuracy of the model. The proposed model predicted vessel positions in the next temporal image with a median accuracy of 0.45 (0.55) mm, 0.45 (0.74) mm and 0.28 (0.58) mm on the MRI, US and CT datasets, respectively. These results show the strong potential of the model, which achieves accurate matching between predicted and target images across several imaging modalities.
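To make the two building blocks named above concrete, the following is a minimal numpy sketch (not the authors' implementation) of a spatial-transformer-style bilinear warp driven by a dense deformation field, and of a composite loss combining an image-similarity term with a smoothness penalty on the field. The function names, the use of MSE as the similarity term and the regularization weight `lam` are illustrative assumptions.

```python
import numpy as np

def warp_bilinear(image, flow):
    """Warp a 2D image with a dense deformation field via bilinear
    sampling, as a spatial transformer layer would.
    flow has shape (H, W, 2) holding per-pixel (dy, dx) displacements."""
    H, W = image.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sy = np.clip(ys + flow[..., 0], 0, H - 1)   # sampling coordinates
    sx = np.clip(xs + flow[..., 1], 0, W - 1)
    y0 = np.floor(sy).astype(int)
    x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = sy - y0                                 # interpolation weights
    wx = sx - x0
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def composite_loss(pred, target, flow, lam=0.01):
    """Image-similarity term (MSE) plus a smoothness penalty on the
    spatial gradients of the predicted deformation field."""
    mse = np.mean((pred - target) ** 2)
    dy = np.diff(flow, axis=0)                   # finite differences
    dx = np.diff(flow, axis=1)
    smooth = np.mean(dy ** 2) + np.mean(dx ** 2)
    return mse + lam * smooth
```

In the full model, a recurrent network would predict `flow` for each future time step, the warp would generate the corresponding future frame, and the composite loss would be back-propagated through both, which is what enables the end-to-end unsupervised training described above.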
Keywords: Deep learning; Free-breathing; LSTM; Liver; Lungs; Motion prediction; Radiotherapy.
Copyright © 2020. Published by Elsevier B.V.