Xu, Yun and Vong, Chi-Man and Xu, Zihao and Fu, Jianlin and Li, Junhua and Chen, Chuangquan (2026) Disentangled Multimodal Spatiotemporal Learning for Hybrid EEG-fNIRS Brain-Computer Interface. IEEE Transactions on Biomedical Engineering. pp. 1-12. DOI https://doi.org/10.1109/tbme.2026.3660692
Abstract
The hybrid EEG-fNIRS brain-computer interface (BCI) combines the high temporal resolution of electroencephalography (EEG) with the high spatial resolution of functional near-infrared spectroscopy (fNIRS) to enable comprehensive brain activity detection. However, integrating these modalities to obtain highly discriminative features remains challenging. Most existing methods fail to effectively capture the spatiotemporal coupling features and correlations between EEG and fNIRS signals. Furthermore, these methods adopt a holistic learning paradigm for the representation of each modality, leading to unrefined and redundant multimodal representations. To address these challenges, we propose a disentangled multimodal spatiotemporal learning (DMSL) method for hybrid EEG-fNIRS BCI systems, which performs multimodal spatiotemporal coupling and disentangled representation learning simultaneously within a unified framework. Specifically, DMSL utilizes a compact convolutional module with one-dimensional temporal and spatial convolution layers to extract complex spatiotemporal patterns from each modality, and introduces a multimodal attention interaction module to capture inter-modality correlations, enhancing the representation of each modality. Subsequently, DMSL applies an adaptive multi-branch graph convolutional module over reconstructed channels to capture the spatiotemporal coupling features, incorporating modality consistency and disparity constraints to disentangle common and modality-specific representations for each modality. The disentangled representations are finally fused adaptively to perform different task predictions. The proposed DMSL achieves state-of-the-art performance on publicly available datasets for mental arithmetic, motor imagery, and emotion recognition tasks, exceeding the best baselines by 2.34%, 0.59%, and 1.47%, respectively. These results demonstrate the effectiveness of DMSL in improving EEG-fNIRS decoding and its strong generalization ability across BCI applications.
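The abstract describes the pipeline only at a high level, so the following is a minimal PyTorch sketch of the first stage: a per-modality encoder with a one-dimensional temporal convolution followed by a spatial convolution across channels, in the spirit of the compact convolutional module described above. The `SpatioTemporalEncoder` name and all layer sizes, kernel widths, and pooling factors are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SpatioTemporalEncoder(nn.Module):
    """Hypothetical per-modality feature extractor: a temporal 1-D
    convolution over each channel's time series, then a spatial
    convolution that mixes information across channels. Sizes are
    placeholders, not taken from the paper."""

    def __init__(self, n_channels: int, n_filters: int = 16, kernel: int = 25):
        super().__init__()
        # Temporal conv: slides along the time axis, shared across channels.
        self.temporal = nn.Conv2d(1, n_filters, kernel_size=(1, kernel),
                                  padding=(0, kernel // 2))
        # Spatial conv: collapses the channel axis in a single step.
        self.spatial = nn.Conv2d(n_filters, n_filters,
                                 kernel_size=(n_channels, 1))
        self.bn = nn.BatchNorm2d(n_filters)
        self.act = nn.ELU()
        self.pool = nn.AvgPool2d((1, 4))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) -> add a singleton feature-map dim.
        x = x.unsqueeze(1)                        # (B, 1, C, T)
        x = self.act(self.bn(self.temporal(x)))  # (B, F, C, T)
        x = self.act(self.spatial(x))             # (B, F, 1, T)
        return self.pool(x).flatten(1)            # (B, F * T/4)

# Toy usage: 30 EEG channels, 2 s at 200 Hz (shapes are arbitrary).
eeg = torch.randn(8, 30, 400)
feats = SpatioTemporalEncoder(n_channels=30)(eeg)
print(feats.shape)  # torch.Size([8, 1600])
```

The modality consistency and disparity constraints could likewise be realized with standard losses; the sketch below assumes a mean-squared-error pull between the two modalities' common representations and an orthogonality-style push between common and modality-specific ones. The paper's exact constraints are not given in the abstract, so these are generic stand-ins.

```python
import torch
import torch.nn.functional as F

def consistency_loss(common_eeg: torch.Tensor,
                     common_fnirs: torch.Tensor) -> torch.Tensor:
    # Pull the two modalities' common representations together.
    return F.mse_loss(common_eeg, common_fnirs)

def disparity_loss(common: torch.Tensor,
                   specific: torch.Tensor) -> torch.Tensor:
    # Push common and modality-specific representations apart via a
    # squared cross-correlation penalty (a common orthogonality choice;
    # the paper's actual constraint may differ).
    c = F.normalize(common, dim=1)
    s = F.normalize(specific, dim=1)
    return (c.t() @ s).pow(2).mean()

# Toy usage with arbitrary 64-dim representations.
common_e, common_f, spec_e = (torch.randn(8, 64) for _ in range(3))
loss = consistency_loss(common_e, common_f) + disparity_loss(common_e, spec_e)
```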
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | BCI; EEG; fNIRS; multimodal representation learning |
| Subjects: | Z Bibliography. Library Science. Information Resources > ZR Rights Retention |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| Date Deposited: | 10 Feb 2026 14:00 |
| Last Modified: | 10 Feb 2026 14:01 |
| URI: | http://repository.essex.ac.uk/id/eprint/42759 |
Available files
Filename: DMSL2026.pdf
Licence: Creative Commons: Attribution 4.0