Autonomous driving systems depend heavily on perception tasks for optimal performance. However, the prevailing datasets focus primarily on scenarios with clear visibility (i.e., sunny and daytime). This concentration makes it difficult to train deep-learning-based perception models for adverse conditions (e.g., rainy and nighttime). In this paper, we propose an unsupervised network for day-to-night image translation that addresses the ill-posed problem of learning a mapping between domains from unpaired data. The proposed method extracts both semantic and geometric information from input images in the form of attention maps. We assume that a multi-task network can extract semantic and geometric information while estimating semantic segmentation and depth maps, respectively. The image-to-image translation network integrates the two distinct types of extracted information, employing them as spatial attention maps. We compare our method with related works both qualitatively and quantitatively. The proposed method shows both qualitative and quantitative improvements in visual presentation over related work.
Keywords: data augmentation; deep learning; domain adaptation; generative adversarial networks; generative model; image-to-image translation.
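
As an illustration of the idea summarized above, the following minimal sketch shows one way that semantic and geometric features could be turned into spatial attention maps and applied to the features of a translation network. It is not the implementation used in this paper: the function names (`spatial_attention`, `apply_attention`), the channel-mean-plus-sigmoid attention, and the product-based fusion with a residual term are all assumptions chosen for brevity, and random arrays stand in for the outputs of the multi-task network.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(features):
    """Collapse a (C, H, W) feature map into a (1, H, W) attention map in (0, 1).

    Assumption: a channel-wise mean followed by a sigmoid stands in for
    whatever learned layer would produce the attention map in practice.
    """
    return sigmoid(features.mean(axis=0, keepdims=True))


def apply_attention(gen_features, sem_features, geo_features):
    """Modulate translation-network features with two spatial attention maps.

    Hypothetical fusion rule: the semantic and geometric maps are multiplied
    element-wise, then applied with a residual term so that no location of
    the original features is suppressed entirely.
    """
    a_sem = spatial_attention(sem_features)  # from the segmentation branch
    a_geo = spatial_attention(geo_features)  # from the depth branch
    attention = a_sem * a_geo                # shape (1, H, W)
    # Broadcasting applies the same spatial map to every channel.
    return gen_features * (1.0 + attention), attention


# Random stand-ins for feature maps; real features would come from the networks.
rng = np.random.default_rng(0)
gen = rng.standard_normal((8, 4, 4))   # translation-network features
sem = rng.standard_normal((16, 4, 4))  # semantic features
geo = rng.standard_normal((16, 4, 4))  # geometric features

out, att = apply_attention(gen, sem, geo)
print(out.shape, att.shape)
```

The attention map has a single channel and is broadcast over all channels of the translation features, so the semantic and geometric branches may have a different channel count from the translation network as long as the spatial sizes match.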