Argument Mining segmentation using multi-task learning

Author name: Leonidas Tsekouras
Title: Argument Mining segmentation using multi-task learning
Year: 2017-2018
Supervisor: Georgios Petasis

Summary

A common problem in argument mining is that manual annotation is expensive and time-consuming, which makes the re-use of existing annotations important. Transfer learning makes this possible: a model is trained on a corpus from a thematic domain with plenty of annotated data, and the knowledge it acquires is then applied to a corpus from a different domain, for which there may be no annotations at all. In this work, we focus on claim identification, a fundamental task in argument mining, in a transfer learning setting. A claim is an argument being made in the text, often supported by facts or other statements. Our proposed approach draws inspiration from HATN [1], a model originally developed for cross-domain sentiment classification. We investigate whether this model can be applied to argument mining by extending the original approach along several dimensions: a) reducing the granularity from document level to sentence level, so that single sentences can be labeled as containing a claim or not; b) testing multiple embeddings, including a word2vec [2] model trained on the corpora we used; c) including multiple sentences as contextual information for the model; d) experimenting with under-sampling the majority class and with using multiple datasets as the source corpus to handle class imbalance. Our evaluation shows that HATN can be successfully adapted to our task, as its performance matched or exceeded that of other state-of-the-art approaches. In addition, our own embeddings achieve similar results while being orders of magnitude smaller. Finally, we highlight some problems with the available datasets for this task, such as their small size and significant class imbalance.
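
As a rough illustration of points (c) and (d), the following is a minimal sketch, not the implementation used in this work. It assumes that a document is already split into sentences, that label 1 marks a sentence containing a claim and label 0 (the assumed majority class) marks one without, and that context is simply the neighbouring sentences. The function names `add_context` and `undersample` and the parameters `window` and `ratio` are hypothetical.

```python
import random
from typing import List, Tuple

Example = Tuple[str, List[str]]  # (target sentence, context sentences)


def add_context(sentences: List[str], window: int = 1) -> List[Example]:
    """Pair each sentence with up to `window` neighbours on each side.

    Assumption: context means adjacent sentences of the same document.
    """
    examples = []
    for i, sentence in enumerate(sentences):
        left = sentences[max(0, i - window):i]
        right = sentences[i + 1:i + 1 + window]
        examples.append((sentence, left + right))
    return examples


def undersample(
    examples: List[Example],
    labels: List[int],
    ratio: float = 1.0,
    seed: int = 0,
) -> Tuple[List[Example], List[int]]:
    """Randomly drop label-0 examples so that at most
    ratio * (number of label-1 examples) of them remain.

    Assumption: label 0 (no claim) is the majority class.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(negatives), int(ratio * len(positives)))
    # Sort the kept indices so the original sentence order is preserved.
    kept = sorted(positives + rng.sample(negatives, n_keep))
    return [examples[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    sents = ["Taxes are high.", "We should lower them.", "It rained.", "The end."]
    labs = [0, 1, 0, 0]
    xs, ys = undersample(add_context(sents, window=1), labs, ratio=1.0)
    print(len(xs), ys.count(0), ys.count(1))
```

With `ratio=1.0` the resulting training set is balanced; larger values keep more of the majority class, which trades balance for training data on small corpora.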