VoPiFNet: Voxel-Pixel Fusion Network for Multi-Class 3D Object Detection
Many LiDAR-based methods are reported to perform well when detecting large objects, when detecting a single object class, or under easy conditions. However, because they do not exploit image semantics, their performance on small targets or under challenging conditions does not exceed that of fusion-based approaches. To improve detection performance in complex environments, this paper proposes a multi-modal, multi-class 3D object detection network named the Voxel-Pixel Fusion Network (VoPiFNet). Within this network, the authors design a key novel component, the Voxel-Pixel Fusion Layer, which exploits the geometric relation of a voxel-pixel pair and fuses voxel features and pixel features using a cross-modal attention mechanism. Moreover, after considering the characteristics of the voxel-pixel pair, the authors design four parameters to guide and enhance the fusion. The proposed layer can be integrated with voxel-based 3D LiDAR detectors and 2D image detectors. Finally, the proposed method is evaluated on the public KITTI benchmark dataset for multi-class 3D object detection at different difficulty levels. Extensive experiments show that the authors' method outperforms state-of-the-art methods in detecting the challenging pedestrian category and achieves promising overall 3D mean average precision (mAP).
Availability:
- Find a library where the document is available. Order URL: http://worldcat.org/oclc/41297384
Supplemental Notes:
- Copyright © 2024, IEEE.
Authors:
- Wang, Chia-Hung (ORCID: 0000-0001-5210-6323)
- Chen, Hsueh-Wei (ORCID: 0000-0002-3829-2155)
- Chen, Yi (ORCID: 0000-0003-1446-7982)
- Hsiao, Pei-Yung (ORCID: 0000-0003-1750-7118)
- Fu, Li-Chen (ORCID: 0000-0002-6947-7646)
- Publication Date: 2024-8
Language
- English
Media Info
- Media Type: Web
- Features: References
- Pagination: pp 8527-8537
Serial:
- IEEE Transactions on Intelligent Transportation Systems
- Volume: 25
- Issue Number: 8
- Publisher: Institute of Electrical and Electronics Engineers (IEEE)
- ISSN: 1524-9050
- Serial URL: http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6979
Subject/Index Terms
- TRT Terms: Data fusion; Image processing; Object detection; Pedestrian detectors
- Subject Areas: Data and Information Technology; Highways; Operations and Traffic Management; Pedestrians and Bicyclists
Filing Info
- Accession Number: 01935921
- Record Type: Publication
- Files: TRIS
- Created Date: Nov 5 2024 11:27AM