Lu, Qiang and Long, Yunfei and Sun, Xia and Feng, Jun and Zhang, Hao (2024) Fact-sentiment Incongruity Combination Network for Multimodal Sarcasm Detection. Information Fusion, 104, 102203. DOI https://doi.org/10.1016/j.inffus.2023.102203
Abstract
Multimodal sarcasm detection aims to identify whether a literal expression contradicts the authentic attitude conveyed within multimodal data. Incongruity-based methods have been applied successfully to multimodal sarcasm detection because they flexibly capture intrinsic differences between modalities. However, previous incongruity methods focused primarily on the semantic level and often overlooked more specific forms of sarcasm incongruity, namely fact incongruity, sentiment incongruity, and combination incongruity. We therefore propose a fact-sentiment incongruity combination network that models multimodal sarcastic relations from a novel perspective, by exploring multimodal factual disparities, sentiment incongruity, and their combined fusion. Specifically, we design a dynamic connecting component that computes dynamic routing probability weights via graph attention and mask routing matrices, selecting the most suitable image-text pairs to capture fact incongruity between images and text. We then retrieve sentiment relations between text tokens and image objects from external sentiment knowledge and use them to reconstruct the edge weights of the cross-modal graph matrix, capturing sentiment incongruity. Furthermore, we introduce a combination incongruity fusion layer and a cross-modal contrastive loss that fuse fact incongruity and sentiment incongruity, further enhancing the incongruity representations. Extensive experiments and analyses on publicly available datasets demonstrate the superiority of the proposed model.
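The cross-modal contrastive loss mentioned in the abstract can be illustrated with an InfoNCE-style objective that pulls matched text-image pairs together and pushes mismatched pairs apart. The sketch below is a minimal assumed illustration, not the paper's actual implementation; the function name, temperature value, and batch construction are all assumptions.

```python
import numpy as np

def cross_modal_contrastive_loss(text_emb, image_emb, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of embeddings.

    Matched (text, image) pairs at the same batch index are treated as
    positives; every other pairing in the batch acts as a negative.
    """
    # L2-normalise each embedding so the dot product is cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature  # (batch, batch) similarity matrix

    # Numerically stable log-softmax over each row; the diagonal entries
    # (matched pairs) are the targets of the cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Well-aligned modalities yield a lower loss than random pairings, which is the signal the fusion layer can exploit when combining fact and sentiment incongruity representations.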
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Combination incongruity fusion; Cross-modal graph; Dynamic connecting component; Multimodal sarcasm detection; Sarcasm incongruity |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 09 Jan 2024 17:18 |
| Last Modified: | 30 Oct 2024 21:13 |
| URI: | http://repository.essex.ac.uk/id/eprint/37290 |
Available files
Filename: FSICN.pdf
Licence: Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0
Embargo Date: 20 June 2025