Alshahrani, Mohammed (2020) Exploring embedding vectors for emotion detection. PhD thesis, University of Essex.
Abstract
Textual data is now generated in vast volumes. With the proliferation of social media and the prevalence of smartphones, short texts such as news headlines, tweets and text advertisements have become a prevalent form of information. Given the huge volume of short texts available, effective and efficient models for detecting emotions in short texts have become highly desirable, and in some cases fundamental, to a range of applications that require an emotional understanding of textual content, such as human-computer interaction, marketing, e-learning and health. Emotion detection from text has been an important task in Natural Language Processing (NLP) for many years. Many approaches have relied on emotional words or lexicons to detect emotions. While word embedding vectors such as Word2Vec have been successfully employed in many NLP approaches, the word mover’s distance (WMD) is a recently introduced method for calculating the distance between two documents based on their embedded words. This thesis investigates the ability to detect or classify emotions in sentences using word vectorization and distance measures. Our results confirm the novelty of using Word2Vec and WMD to predict emotions in short text. We propose a new methodology based on identifying “idealised” vectors that capture the essence of an emotion; we define these vectors as having the minimal distance (using some metric function) between a vector and the embeddings of the text that contains the relevant emotion (e.g. a tweet, a sentence). We look for these vectors by searching the space of word embeddings using the covariance matrix adaptation evolution strategy (CMA-ES). Our method produces state-of-the-art results, surpassing classic supervised learning methods.
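The idea of an “idealised” emotion vector can be illustrated with a small sketch. This is not the thesis implementation, and it rests on several assumptions: the 3-dimensional embeddings are invented toy values rather than Word2Vec vectors, each text is represented by the average of its word embeddings, the metric is Euclidean distance rather than WMD, and a plain (mu, lambda) evolution strategy with a decaying step size stands in for CMA-ES. All function and variable names are hypothetical.

```python
import math
import random

# Toy 3-dimensional "embeddings"; every value is invented for illustration.
TOY_EMBEDDINGS = {
    "happy": [0.9, 0.1, 0.0],
    "joy": [0.8, 0.2, 0.1],
    "smile": [0.7, 0.3, 0.0],
    "great": [0.85, 0.15, 0.05],
    "day": [0.4, 0.4, 0.4],
}


def text_vector(tokens):
    """Average the embeddings of the tokens in one text (an assumption)."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def mean_distance(candidate, text_vectors):
    """Mean Euclidean distance from a candidate vector to every text vector."""
    total = sum(math.dist(candidate, tv) for tv in text_vectors)
    return total / len(text_vectors)


def evolve_ideal_vector(text_vectors, generations=200, population=20,
                        parents=5, sigma=0.5, seed=0):
    """Simplified (mu, lambda) evolution strategy, a stand-in for CMA-ES.

    Unlike CMA-ES, it adapts no covariance matrix: it samples isotropic
    Gaussian offspring around the current mean and shrinks sigma each step.
    """
    rng = random.Random(seed)
    dim = len(text_vectors[0])
    mean = [0.0] * dim  # start the search at the origin
    for _ in range(generations):
        offspring = [[m + rng.gauss(0.0, sigma) for m in mean]
                     for _ in range(population)]
        # Rank offspring by the objective: lower mean distance is better.
        offspring.sort(key=lambda c: mean_distance(c, text_vectors))
        best = offspring[:parents]
        # Recombine: the new mean is the average of the best offspring.
        mean = [sum(c[i] for c in best) / parents for i in range(dim)]
        sigma *= 0.97  # fixed decay schedule instead of step-size adaptation
    return mean


# Hypothetical texts all labelled with the same emotion ("joy").
joy_texts = [["happy", "day"], ["joy", "smile"], ["great", "day"]]
joy_vectors = [text_vector(t) for t in joy_texts]
ideal_joy = evolve_ideal_vector(joy_vectors)
```

Under these assumptions, a new text could be classified by computing its distance to the idealised vector of each emotion and choosing the nearest one. Replacing the stand-in optimiser with a real CMA-ES implementation and the Euclidean metric with WMD would bring the sketch closer to the approach the abstract describes.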
Item Type: | Thesis (PhD) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Mohammed Alshahrani |
Date Deposited: | 06 Nov 2020 10:24 |
Last Modified: | 06 Nov 2020 10:24 |
URI: | http://repository.essex.ac.uk/id/eprint/29037 |
Available files
Filename: Mohammed_Alshahrani.pdf