Automatic stance detection on political discourse in Twitter
The majority of opinion mining tasks in natural language processing (NLP) have focused on sentiment analysis of texts about products and services, while there is comparatively little research on the automatic detection of political opinion. Almost all previous research has been done for English. This thesis focuses on the automatic detection of stance (whether the author is in favor of or against an important political topic) in Twitter posts in Catalan, Spanish and English. The main objective of this work is to build and compare automatic stance detection systems using both classic supervised machine learning and deep learning techniques. We also study the influence of text normalization and perform experiments with different methods for word representation, such as TF-IDF measures for unigrams, word embeddings, tweet embeddings, and contextual character-based embeddings. We obtain state-of-the-art results in the stance detection task on the IberEval 2018 dataset. Our research shows that text normalization and feature selection are important for systems with unigram features but do not affect performance when working with word vector representations. Classic methods such as unigram features with an SVM classifier still outperform deep learning techniques, but seem to be prone to overfitting. The classifiers trained using word vector representations and the neural network models encoded with contextual character-based vectors show greater robustness.
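As a rough illustration of the unigram TF-IDF representation mentioned above, the following minimal sketch normalizes tweets and computes TF-IDF weights in plain Python. The normalization steps (lowercasing, masking URLs and user mentions), the placeholder tokens `urltoken` and `usertoken`, and the function names `normalize` and `tfidf` are assumptions made for this example; the abstract does not specify the thesis's actual preprocessing or weighting scheme.

```python
import math
import re
from collections import Counter

# Hypothetical normalization patterns; not taken from the thesis.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[#\w]+")


def normalize(tweet):
    """Lowercase a tweet, mask URLs and mentions, and split into unigrams."""
    text = tweet.lower()
    text = URL_RE.sub("urltoken", text)
    text = MENTION_RE.sub("usertoken", text)
    return TOKEN_RE.findall(text)


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and an unsmoothed idf, log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


tweets = [
    "Visca Catalunya! https://t.co/x @user1",
    "Visca la democracia",
]
tokens = [normalize(t) for t in tweets]
vectors = tfidf(tokens)
```

With this unsmoothed weighting, a term that occurs in every tweet (here `visca`) receives weight zero, while terms specific to one tweet receive positive weights. Vectors of this kind could then be passed to a linear classifier such as an SVM.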