A Video Is Worth Three Views: Trigeminal Transformers for Video-Based Person Re-Identification
Video-based person re-identification (Re-ID) is an active research topic in intelligent transportation systems; it aims to retrieve video sequences of the same person across non-overlapping surveillance cameras. Compared with static images, video sequences contain more visual information from multiple views, such as the spatial and temporal views. However, previous Re-ID methods usually focus on a single, limited view and lack diverse observations from different views. To capture richer perceptions and extract more comprehensive representations, the authors propose a novel learning framework named Trigeminal Transformers (TMT) for video-based person Re-ID. More specifically, the authors first design a View-wise Projector (VP) to jointly transform raw videos from spatial, temporal and spatial-temporal views. In addition, inspired by the success of Vision Transformers (ViT), the authors introduce the Transformer structure for information enhancement and aggregation. In this work, three Self-view Transformers (ST) exploit the relationships among local features for information enhancement in the spatial, temporal and spatial-temporal views, respectively. Moreover, a Cross-view Transformer (CT) aggregates the multi-view features into comprehensive representations. Experimental results indicate that the approach achieves better performance than several other state-of-the-art approaches on four public Re-ID benchmarks.
Availability:
- Find a library where document is available. Order URL: http://worldcat.org/oclc/41297384
Supplemental Notes:
- Copyright © 2024, IEEE.
Authors:
- Liu, X
- Zhang, P
- Yu, Chaofan
-
0000-0002-7112-7821
- Qian, X
- Yang, X
- Lu, Haoang
- Publication Date: 2024-9
Language
- English
Media Info
- Media Type: Web
- Features: References
- Pagination: pp 12818-12828
Serial:
- IEEE Transactions on Intelligent Transportation Systems
- Volume: 25
- Issue Number: 9
- Publisher: Institute of Electrical and Electronics Engineers (IEEE)
- ISSN: 1524-9050
- Serial URL: http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6979
Subject/Index Terms
- TRT Terms: Data quality; Detection and identification technologies; Human beings; Image analysis; Machine vision; Video
- Subject Areas: Safety and Human Factors; Security and Emergencies; Transportation (General)
Filing Info
- Accession Number: 01938905
- Record Type: Publication
- Files: TRIS
- Created Date: Dec 6 2024 2:15PM