Multi-Level Contextual RNNs With Attention Model for Scene Labeling

Image context is crucial for improving scene labeling. Existing methods exploit only the local context generated from a small surrounding area of an image patch or pixel, often ignoring long-range and global contextual information. To address this issue, the authors propose a novel approach to scene labeling based on multi-level contextual recurrent neural networks (RNNs). They encode three kinds of contextual cues, viz., local context, global context, and image topic context, in structural RNNs to model long-range local and global dependencies in an image. In this way, the authors' method is able to “see” the image in terms of both long-range local and holistic views, and make a more reliable inference for image labeling. In addition, they integrate the proposed contextual RNNs into hierarchical convolutional neural networks, exploiting dependence relationships at multiple levels to provide rich spatial and semantic information. Moreover, they adopt an attention model to effectively merge the multiple levels and show that it outperforms average- or max-pooling fusion strategies. Extensive experiments demonstrate that the proposed approach achieves improved results on the CamVid, KITTI, SiftFlow, Stanford Background, and Cityscapes data sets.
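The attention-based fusion mentioned in the abstract can be illustrated with a minimal sketch: per-pixel softmax weights computed over the feature levels select how much each level contributes, in contrast to plain average- or max-pooling. This is not the authors' implementation; all shapes, function names, and the score inputs below are assumptions for illustration.

```python
import numpy as np

def attention_fuse(features, scores):
    """Hypothetical attention fusion over feature levels.
    features: (L, H, W, C) feature maps from L levels;
    scores:   (L, H, W) unnormalized per-level attention scores.
    Returns an (H, W, C) map: softmax over L, then a weighted sum."""
    w = np.exp(scores - scores.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)          # weights sum to 1 over levels
    return (w[..., None] * features).sum(axis=0)

def average_fuse(features):
    # Average-pooling baseline: uniform weight for every level.
    return features.mean(axis=0)

def max_fuse(features):
    # Max-pooling baseline: element-wise maximum across levels.
    return features.max(axis=0)

rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4, 4, 8))   # 3 levels, 4x4 spatial map, 8 channels
scores = rng.normal(size=(3, 4, 4))
fused = attention_fuse(feats, scores)
print(fused.shape)  # (4, 4, 8)
```

Note that with equal scores at every level, the softmax weights become uniform and attention fusion reduces to the average-pooling baseline; the learned scores are what let the model emphasize different levels at different pixels.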

Language

  • English

Filing Info

  • Accession Number: 01690038
  • Record Type: Publication
  • Files: TLIB, TRIS
  • Created Date: Nov 14 2018 1:52PM