Pansara, Zeel (2025) Automated emotion detection using multi-modal physiological signals: a path towards clinical applications. Doctoral thesis, University of Essex. DOI https://doi.org/10.5526/ERR-00041684
Abstract
The ability to accurately detect and understand human emotions is crucial for various applications, from enhancing human-computer interaction to advancing mental health monitoring and neurorehabilitation. This thesis presents a multimodal emotion detection model that integrates Facial Emotion Recognition (FER), pupil size, and Galvanic Skin Response (GSR) to provide a continuous and subtle analysis of emotional states. Unlike conventional emotion detection systems that rely on single-modality approaches, this research overcomes key limitations by combining physiological signals that capture both the arousal and valence dimensions of emotion. A key contribution of this study is a robust multimodal emotion detection framework that enhances pupil-based emotion prediction by isolating emotional signals from luminosity effects, improving feature reliability. Across 32 emotionally varied video clips shown to 47 participants, our corrected model achieved strong predictive performance (mean correlation of 0.65 ± 0.12, an R² score of 0.43 ± 0.12, and a Normalised Root Mean Square Error (NRMSE) of 0.27 ± 0.036), significantly outperforming models using uncorrected pupil size. These results highlight the importance of addressing environmental confounds and the model's potential for real-world applications in affective computing. After obtaining pupil size corrected for luminosity, we also extracted features from FER and GSR and integrated them via feature-level fusion. We then trained and evaluated an emotion detection machine learning model on the same 47 participants. The model employs a regression-based approach using the Extreme Gradient Boosting (XGBoost) algorithm, a powerful machine learning technique, to fuse these multimodal features.
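The luminosity correction described above can be sketched as regressing pupil size on screen luminance and keeping the residuals, so that light-driven constriction is removed while emotion-driven variation survives. This is a minimal illustration under a linear light-reflex assumption; the thesis's actual correction method may differ, and all signal names here are synthetic.

```python
import numpy as np

def correct_pupil_for_luminosity(pupil, luminance):
    """Remove the linear effect of luminance from a pupil-size trace.

    Fits pupil ~ a*luminance + b by least squares and returns the
    residuals, which retain non-luminance (e.g. emotional) variation.
    (Hypothetical helper; a sketch, not the thesis's exact pipeline.)
    """
    a, b = np.polyfit(luminance, pupil, deg=1)
    return pupil - (a * luminance + b)

# Toy trace: pupil constricts as luminance rises, plus an "emotional" bump.
t = np.arange(200)
luminance = 0.5 + 0.5 * np.sin(t)                    # synthetic screen brightness
emotion = np.where((t > 80) & (t < 120), 0.3, 0.0)   # synthetic arousal bump
pupil = 4.0 - 1.2 * luminance + emotion              # light reflex + emotion

corrected = correct_pupil_for_luminosity(pupil, luminance)
```

After correction, the residual trace is uncorrelated with luminance by construction, so what remains of the pupil signal can be attributed to non-luminance sources such as emotional arousal.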
The model achieves higher accuracy than models trained on single physiological features, with a correlation of 0.91 ± 0.041, an R² of 0.710 ± 0.098, and an NRMSE of 0.183 ± 0.030 for valence, and a correlation of 0.86 ± 0.061, an R² of 0.665 ± 0.359, and an NRMSE of 0.187 ± 0.070 for arousal, showcasing its ability to predict emotional states continuously. The model was evaluated on a diverse set of participants, showing robustness to inter-subject variability, and was designed with a lightweight architecture suitable for real-time use on wearable and mobile platforms. By addressing challenges such as signal fusion, temporal misalignment, and computational efficiency, this work advances the deployment of multimodal emotion detection systems. It lays the groundwork for emotion-aware technologies in clinical care, neurorehabilitation, and human–computer interaction, enabling continuous and personalised monitoring of emotional states.
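The feature-level fusion and regression step can be sketched as concatenating per-window features from each modality into one vector and fitting gradient-boosted regression trees against a continuous valence target. The features and data below are synthetic placeholders (the thesis's real feature set is richer), and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost's `XGBRegressor`, which exposes the same `fit`/`predict` interface.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Hypothetical per-window features from each modality (illustrative names only).
fer_valence = rng.normal(size=n)      # facial-expression valence estimate
gsr_peak_rate = rng.normal(size=n)    # skin-conductance response rate
pupil_corrected = rng.normal(size=n)  # luminosity-corrected pupil size

# Feature-level fusion: concatenate modality features into one design matrix.
X = np.column_stack([fer_valence, gsr_peak_rate, pupil_corrected])

# Synthetic continuous valence target driven by all three modalities plus noise.
y = (0.6 * fer_valence + 0.3 * pupil_corrected
     + 0.2 * gsr_peak_rate + 0.1 * rng.normal(size=n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Gradient-boosted regression trees on the fused feature vector.
model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
model.fit(X_tr, y_tr)
r2 = r2_score(y_te, model.predict(X_te))
```

In the same way, a second regressor would be trained on the arousal target, giving the two continuous outputs reported above.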
Item Type: Thesis (Doctoral)
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Zeel Pansara
Date Deposited: 13 Oct 2025 08:48
Last Modified: 13 Oct 2025 08:48
URI: http://repository.essex.ac.uk/id/eprint/41684
Available files
Filename: 2113603_Zeel Pansara.pdf