Antai, Roseline (2016) A New Hybrid Approach to Sentiment Classification. PhD thesis, University of Essex.
Abstract
With the advancement of the World Wide Web, sharing opinions online has become widely popular. These opinions are used for decision making, market analysis, and other applications. The need to harness them has led to the development and subsequent advancement of the field of Sentiment Analysis. Several challenges have arisen as a result, such as the difficulty of locating opinions in a body of text and of determining their sentiment, or polarity. To tackle polarity determination, a number of classification approaches have been developed. These approaches classify opinions at various levels, such as the document, sentence and aspect levels. Most document-level approaches treat a document as a bag of words and hence classify it as a whole. The problem with this is that a single document may contain a mixture of opinions directed towards various aspects. It is therefore imperative to use a classification approach that takes these constituent opinions into account, which is the focus of approaches that work at the aspect level.

Another important factor in polarity classification is the choice of classification approach: machine learning, lexicon-based, or, more recently, hybrid. Machine learning approaches classify with high accuracy and handle large feature sets efficiently, which makes them a favoured choice where high accuracy is desired. Their drawback is that they are difficult to adapt to new domains, because sentiment words are domain dependent. Pure lexicon-based approaches do not achieve the accuracy of machine learning approaches, but are said to offer more explainable results and to take into account the information in lexicons.
In this work, we present a novel hybrid approach, which incorporates information from lexicons into a machine learning classifier and takes various linguistic knowledge sources as features. The approach uses transitive dependencies to incorporate the opinions expressed towards different aspects of a document when determining the polarity of the whole document. The domain dependency of sentiment words is also addressed, through the use of composite features and a domain-specific lexicon created in this work. It was found that the use of transitive dependencies in aspect-focused classification is a promising area, with the potential to improve aspect-based classification once the aspects have been properly determined. It was also found that, although composite features do not necessarily improve classification accuracy, they give rise to context-rich classifiers. The domain-specific lexicon generated in this work performed on par with the widely used generic lexicon, SentiWordNet.
Item Type: | Thesis (PhD) |
---|---|
Subjects: | P Language and Literature > P Philology. Linguistics; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Roseline Antai |
Date Deposited: | 24 May 2016 10:46 |
Last Modified: | 23 May 2018 01:00 |
URI: | http://repository.essex.ac.uk/id/eprint/16780 |
Available files
Filename: Antai_thesis.pdf