Zhao, Xu and Tang, Chao and Hu, Huosheng and Wang, Wenjian and Qiao, Shuo and Tong, Anyang (2025) Attention mechanism based multimodal feature fusion network for human action recognition. Journal of Visual Communication and Image Representation. p. 104459. DOI https://doi.org/10.1016/j.jvcir.2025.104459
Abstract
Current human action recognition (HAR) methods focus on integrating multiple data modalities, such as skeleton data and RGB data. However, they struggle to exploit motion correlation information in skeleton data and rely on spatial representations from RGB modalities. This paper proposes a novel Attention-based Multimodal Feature Integration Network (AMFI-Net) designed to enhance modal fusion and improve recognition accuracy. First, RGB and skeleton data undergo multi-level preprocessing to obtain differential movement representations, which are then input into a heterogeneous network for separate multimodal feature extraction. Next, an adaptive fusion strategy is employed to enhance the integration of these multimodal features. Finally, the network assesses the confidence level of weighted skeleton information to determine the extent and type of appearance information to be used in the final feature integration. Experiments conducted on the NTU RGB+D dataset demonstrate that the proposed method is feasible and significantly improves human action recognition accuracy.
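The abstract does not give implementation details, so the following is only an illustrative sketch of two ideas it mentions: a differential movement representation and a confidence-dependent use of appearance information. Everything here is an assumption made for illustration, not the AMFI-Net design: the function names `frame_differences` and `confidence_gated_fusion` are hypothetical, simple frame differencing stands in for the paper's multi-level preprocessing, and the gate `1 - max(softmax(skeleton_scores))` is one plausible way to let low skeleton confidence admit more RGB evidence.

```python
import math


def frame_differences(frames):
    """Return element-wise differences between consecutive frames.

    Each frame is a flat list of coordinates (e.g. skeleton joints).
    Assumption: plain differencing stands in for the paper's preprocessing.
    """
    return [
        [curr - prev for prev, curr in zip(a, b)]
        for a, b in zip(frames, frames[1:])
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_gated_fusion(skeleton_scores, rgb_scores):
    """Fuse per-class scores from two streams (hypothetical scheme).

    The skeleton stream's confidence is taken as its largest softmax
    probability. The less confident it is, the more weight the RGB
    (appearance) scores receive.
    """
    confidence = max(softmax(skeleton_scores))
    gate = 1.0 - confidence  # 0 when skeleton is certain, larger otherwise
    fused = [s + gate * r for s, r in zip(skeleton_scores, rgb_scores)]
    return fused, gate


if __name__ == "__main__":
    motion = frame_differences([[0, 0], [1, 2], [3, 2]])
    fused, gate = confidence_gated_fusion([0.0, 0.0], [1.0, 3.0])
    print(motion, fused, gate)
```

In this sketch, an undecided skeleton stream (equal scores for two classes) yields a gate of 0.5, so half of the RGB evidence is added, while a highly confident skeleton stream drives the gate toward zero and the RGB scores are nearly ignored.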
Item Type: | Article |
---|---|
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 01 May 2025 14:28 |
Last Modified: | 01 May 2025 14:31 |
URI: | http://repository.essex.ac.uk/id/eprint/40769 |
Available files
Filename: JVCI-V110-2025-104459.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0
Embargo Date: 21 April 2026