Lu, Qiang and Sun, Xia and Long, Yunfei and Zhao, Xiaodi and Wang, Zou and Feng, Jun and Wang, Xuxin (2024) Multimodal Dual Perception Fusion Framework for Multimodal Affective Analysis. Information Fusion, 115, 102747. DOI https://doi.org/10.1016/j.inffus.2024.102747
Abstract
The misuse of social platforms and the difficulty of moderating posted content have led to a surge of negative sentiment, sarcasm, and the rampant spread of fake news. In response, multimodal sentiment analysis, sarcasm detection, and fake news detection based on image and text have recently attracted considerable attention. Because these tasks share semantic and sentiment features and face related fusion challenges in deciphering complex human expressions across modalities, integrating them into a unified framework is expected to simplify research in sentiment analysis and to improve classification tasks that involve both semantic and sentiment modelling. We therefore treat these tasks as integral components of a broader line of research, multimodal affective analysis towards semantics and sentiment, and propose a novel multimodal dual perception fusion framework (MDPF). Specifically, MDPF comprises three core procedures: (1) generating bootstrapping language-image knowledge to enrich the original modality space, and applying cross-modal contrastive learning to align the text and image modalities and capture their underlying semantics and interactions; (2) designing a dynamic connective mechanism to adaptively match image-text pairs, while employing a Gaussian-weighted distribution to intensify semantic sequences; (3) constructing a cross-modal graph to preserve the structured information of both image and text data and to share information between modalities, while introducing sentiment knowledge to refine the edge weights of the graph and capture cross-modal sentiment interaction. We evaluate MDPF on three publicly available datasets across three tasks, and the empirical results demonstrate the superiority of the proposed model.
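To make the alignment step in procedure (1) concrete, the sketch below shows a standard symmetric cross-modal contrastive objective (CLIP/InfoNCE-style) between paired text and image embeddings. This is a minimal illustration assuming PyTorch, not the authors' MDPF implementation; the function and variable names are hypothetical.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired text/image embeddings.

    text_emb, image_emb: (batch, dim) tensors; row i of each is a matched pair.
    Matched pairs are pulled together and mismatched pairs pushed apart in the
    shared embedding space.
    """
    # L2-normalise so dot products are cosine similarities.
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)

    # (batch, batch) similarity matrix; diagonal entries are the true pairs.
    logits = text_emb @ image_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Text-to-image and image-to-text cross-entropy, averaged.
    loss_t2i = F.cross_entropy(logits, targets)
    loss_i2t = F.cross_entropy(logits.t(), targets)
    return (loss_t2i + loss_i2t) / 2
```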
Item Type: Article
Uncontrolled Keywords: fake news detection; multimodal affective analysis; multimodal dual perception fusion; multimodal sentiment analysis; sarcasm detection
Divisions: Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of
SWORD Depositor: Unnamed user with email elements@essex.ac.uk
Depositing User: Unnamed user with email elements@essex.ac.uk
Date Deposited: 16 Oct 2024 13:42
Last Modified: 27 Nov 2024 19:08
URI: http://repository.essex.ac.uk/id/eprint/39412
Available files
Filename: MDPF.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0
Embargo Date: 22 October 2025