Embedded Control Gate Fusion and Attention Residual Learning for RGB–Thermal Urban Scene Parsing

The semantic segmentation of road scenes is an important task in autonomous driving. Deep learning has enabled the development of a variety of semantic segmentation networks that use RGB and depth data. However, poor lighting conditions and limited long-distance sensing restrict the applicability of RGB and depth cameras; nevertheless, many existing methods still rely on precise depth maps for scene segmentation. Unlike depth information, thermal imaging provides a visual representation of heat that remains reliable under a wide range of lighting conditions and over longer distances. For robust and accurate segmentation of scenes collected during autonomous driving, the authors used the MobileNetV2 network for feature extraction together with a fusion strategy based on an embedded control gate. In addition, they adopted an encoder–decoder scheme for semantic segmentation and developed an attention residual learning strategy to restore the resolution of the feature maps. Finally, semantic and boundary supervision were introduced to optimize the parameters of the proposed network. Experimental results show that the proposed network outperforms existing networks on urban scene segmentation and that it generalizes to depth data.
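
The record gives only a high-level description of the architecture. As a concrete illustration, the PyTorch sketch below shows one way the two title components could be realized, assuming the embedded control gate is a learned sigmoid mask that weights thermal features before they are merged into the RGB stream, and that attention residual learning adds an attention-weighted residual from the fused skip features while the decoder upsamples. The module names, channel sizes, and merge rules here are illustrative assumptions, not the authors' design.

# Minimal sketch (assumptions, not the authors' code) of the two ideas named
# in the title: (1) an embedded control gate that weights thermal features
# before fusing them into the RGB stream, and (2) an attention residual
# block used while upsampling decoder features back to full resolution.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedFusion(nn.Module):
    """Fuse RGB and thermal features with a learned sigmoid control gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([rgb, thermal], dim=1))
        # The gate decides, per pixel and channel, how much thermal evidence to add.
        return rgb + g * thermal


class AttentionResidualUp(nn.Module):
    """Upsample decoder features and refine them with an attended residual."""

    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        # Bring the coarse decoder features up to the skip-connection resolution.
        x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                          align_corners=False)
        # Residual correction from the skip features, weighted by a learned attention map.
        return x + self.attn(skip) * self.refine(skip)


if __name__ == "__main__":
    rgb = torch.randn(1, 32, 60, 80)       # one RGB encoder stage output
    thermal = torch.randn(1, 32, 60, 80)   # the matching thermal stage output
    fused = GatedFusion(32)(rgb, thermal)

    low = torch.randn(1, 32, 30, 40)       # coarser decoder features
    out = AttentionResidualUp(32)(low, fused)
    print(out.shape)                        # torch.Size([1, 32, 60, 80])

Read this way, the gate lets the network suppress thermal cues where the RGB signal is already reliable and lean on them in dark or distant regions, which matches the motivation stated in the abstract.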

Language

  • English

Filing Info

  • Accession Number: 01893793
  • Record Type: Publication
  • Files: TRIS
  • Created Date: Sep 20 2023 11:42AM