Road surface semantic segmentation for autonomous driving

PeerJ Comput Sci. 2024 Sep 25:10:e2250. doi: 10.7717/peerj-cs.2250. eCollection 2024.

Abstract

Although semantic segmentation is widely employed in autonomous driving, its performance in segmenting road surfaces falls short in complex traffic environments. This study proposes a frequency-based semantic segmentation method with a transformer (FSSFormer), motivated by the sensitivity of semantic segmentation to frequency information. Specifically, we propose a weight-sharing factorized attention that selects important frequency features and thereby improves the segmentation of overlapping targets. Moreover, to address the loss of boundary information, we use a cross-attention method that combines spatial and frequency features to recover finer pixel-level detail. To improve segmentation accuracy in complex road scenarios, we adopt a parallel-gated feedforward network to encode position information. Extensive experiments on the Cityscapes dataset demonstrate that FSSFormer improves the mean intersection over union (mIoU) by 2% compared with existing segmentation methods.
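
The abstract reports its result as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how this metric is commonly computed from predicted and ground-truth label maps. It is not the authors' evaluation code. The function names `confusion_matrix` and `mean_iou`, the toy label arrays, and the ignore label 255 (a convention often used for unlabeled pixels) are assumptions made for illustration.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels per (true class, predicted class) pair.

    Rows index the ground-truth class, columns the predicted class.
    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed convention for unlabeled pixels).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    keep = target != ignore_index
    # Encode each (true, predicted) pair as one integer and histogram it.
    idx = target[keep] * num_classes + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average per-class IoU over classes that occur in pred or target."""
    tp = np.diag(conf).astype(np.float64)
    # Union = predicted-as-class + truly-class - intersection.
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    if not valid.any():
        return float("nan")
    return float((tp[valid] / union[valid]).mean())


# Toy 2x2 label maps with two classes (hypothetical data).
target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
conf = confusion_matrix(pred, target, num_classes=2)
miou = mean_iou(conf)
```

In this toy case class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so `miou` is 7/12. In practice the confusion matrix is usually accumulated over all evaluation images before the per-class IoU is taken.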

Keywords: Cross-attention combining spatial and frequency features; Parallel-gated feedforward network; Semantic segmentation; Transformer; Weight-sharing factorized attention.

Grants and funding

This work was supported by the National Natural Science Foundation of China project (51278227), the Natural Science Foundation of Heilongjiang (LH2022F052), the National Natural Science Foundation Training Project of Jiamusi University (JMSUGPZR2022-015), the Space-Land Collaborative Smart Agriculture Innovation team (2023-KYYWF-0638), the Jiamusi University "East Pole" academic team project (DJXSTD202417) and the Doctoral Program of Jiamusi University (JMSUBZ2022-13). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.