Author name | Dimitrios Tsintzouras |
---|---|
Title | Aspect Based Sentiment Analysis on hotel reviews in Greek |
Year | 2018-2019 |
Supervisor | Georgios Petasis |
In recent years, a growing number of businesses have used customer reviews of their products and services as a feedback mechanism for adapting to changing consumer demands. Identifying sentiment in text (Sentiment Analysis) is critical for making this process more automated and efficient. Sentiment analysis focuses on categorizing a text’s overall sentiment, which may leave out essential information such as the distinct sentiments associated with different aspects of the text. Aspect-Based Sentiment Analysis (ABSA) is the more difficult task of determining the sentiment expressed toward specific targets in a text. As a result of recent breakthroughs in deep learning, the research community has become more interested in ABSA, and various architectures that produce state-of-the-art results have been proposed. Most of these approaches are applied to English-language datasets, and efforts to apply them to other languages remain limited. The goal of this thesis is to examine aspect-based sentiment analysis in the Greek language. Starting from a small dataset of hotel reviews in Greek, we first annotated the documents to specify the aspects and their corresponding polarity. Then, some of the state-of-the-art approaches used for this task in English were investigated and slightly altered so that they could be applied to our Greek dataset. Specifically, several architectures are applied, such as Recurrent Neural Networks (RNNs) and the pretrained multilingual Bidirectional Encoder Representations from Transformers (BERT) model. Finally, we propose a model that is in essence an extension of the high-scoring state-of-the-art model LCF-BERT, with a lexicon inserted into its architecture in order to further improve its performance.
The obtained results, especially for the neutral sentiment class, which is the class with the fewest instances in our dataset, are encouraging, underscoring the robustness of the proposed approach.