ALGA: Adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs
- Citation data: Knowledge-Based Systems, ISSN: 0950-7051, Vol: 122, Page: 1-16
- Publication Year:
- Computer Science; Business, Management and Accounting; Decision Sciences
Sentiment analysis classifies the opinions expressed in text. This study aims to improve polarity classification of sentiments in microblogs by building adaptive sentiment lexicons. The proposed method combines corpus-based and lexicon-based approaches and generates lexicons from text. Sentiment classification is formulated as an optimization problem whose goal is to find optimum sentiment lexicons, and a novel genetic algorithm is proposed to solve this problem and find lexicons for classifying text. The algorithm generates adaptive sentiment lexicons, from which a meta-level feature is extracted and then used alongside Bing Liu's lexicon and n-gram features. Experiments are conducted on six datasets. In terms of accuracy, the results outperform the state-of-the-art methods proposed in the literature on two of the datasets; in terms of F-measure, the proposed approach outperforms them on four. The accuracy is higher than 80% on all six datasets, and the F-measure is higher than 80% on four of them. The sentiment lexicons created by the proposed algorithm give a better understanding of the specific language and culture of Twitter users and of the sentiment orientation of words in different contexts. It is also shown that it is useful not to omit conventional stop-words, as each word can carry its own sentiment implications.
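To make the idea of treating lexicon construction as an optimization problem concrete, the following is a minimal sketch of a generic genetic algorithm that evolves per-word sentiment scores so as to maximize classification accuracy on a labeled corpus. It is not the ALGA algorithm itself: the abstract does not describe the chromosome encoding, operators, or fitness function, so the real-valued encoding, tournament selection, uniform crossover, Gaussian mutation, elitism, the toy corpus, and all names (`CORPUS`, `VOCAB`, `classify`, `fitness`, `evolve`) are assumptions made purely for illustration. In keeping with the abstract's remark about stop-words, no words are filtered from the vocabulary.

```python
import random

# Toy labeled corpus (hypothetical data): 1 = positive, 0 = negative.
CORPUS = [
    ("i love this phone", 1),
    ("what a great day", 1),
    ("love the great service", 1),
    ("i hate this phone", 0),
    ("what a terrible day", 0),
    ("hate the terrible service", 0),
]

# Every word is kept, including conventional stop-words.
VOCAB = sorted({w for text, _ in CORPUS for w in text.split()})


def classify(lexicon, text):
    """Label a text positive (1) if its summed word scores exceed zero."""
    score = sum(lexicon.get(w, 0.0) for w in text.split())
    return 1 if score > 0 else 0


def fitness(chromosome, corpus=CORPUS):
    """Fraction of corpus texts classified correctly by this chromosome.

    A chromosome is a list of scores, one per word in VOCAB (assumed encoding).
    """
    lexicon = dict(zip(VOCAB, chromosome))
    hits = sum(classify(lexicon, text) == label for text, label in corpus)
    return hits / len(corpus)


def evolve(pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    """Evolve a lexicon; returns (lexicon dict, training accuracy)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in VOCAB] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry the two best over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation applied gene by gene.
            child = [
                g + rng.gauss(0, 0.3) if rng.random() < mutation_rate else g
                for g in child
            ]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return dict(zip(VOCAB, best)), fitness(best)


if __name__ == "__main__":
    lexicon, accuracy = evolve()
    print(f"training accuracy: {accuracy:.2f}")
    for word, score in sorted(lexicon.items(), key=lambda kv: kv[1]):
        print(f"{word:>10s} {score:+.2f}")
```

In a fuller pipeline along the lines the abstract describes, a score derived from the learned lexicon could serve as one meta-level feature alongside features from a fixed lexicon and n-grams; how the paper actually combines them is not specified here.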