Weakly-Supervised Fine-Grained Semantic Indexing of Biomedical Literature using Citations

Author name: Eleni Voulgari
Title: Weakly-Supervised Fine-Grained Semantic Indexing of Biomedical Literature using Citations
Year: 2017-2018
Supervisor: Anastasia Krithara

Summary

Semantic indexing of biomedical literature is essential for many research areas in bioinformatics, such as data mining and knowledge retrieval. Annotating biomedical research publications with Medical Subject Headings (MeSH) results in coarse-grained indexing, because the terms assigned are MeSH descriptors, each of which may correspond to several related but distinct biomedical concepts. Such annotations may not provide adequate information to professionals who need to extract more specific domain knowledge. In this Master’s thesis, we propose a methodology in which a training dataset is enriched with semantic features from citations and/or references and then used to train an available concept-level automatic annotator, in order to investigate possible changes in its performance. The approach is evaluated on narrower concepts related to the Alzheimer’s Disease MeSH descriptor. The results indicate that, with a proper choice of classifiers and an appropriate definition of the input parameters, classifiers trained on the enriched dataset can outperform the base classifiers. The best performance is obtained when the training dataset contains semantic features from both citations and references.
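
The enrichment step can be illustrated with a minimal sketch. It is not the thesis implementation: the function name, the toy data, and the choice to represent semantic features as concept counts are all assumptions made for illustration. The sketch also assumes that "citations" are the articles citing a document and "references" are the articles it cites.

```python
from collections import Counter

def enrich_features(doc_id, features, citations, references,
                    use_citations=True, use_references=True):
    """Merge a document's concept counts with those of its linked documents.

    All names here are hypothetical; the summary does not specify how the
    thesis represents or combines semantic features.
    """
    enriched = Counter(features.get(doc_id, {}))
    linked = []
    if use_citations:
        linked += citations.get(doc_id, [])   # documents that cite doc_id
    if use_references:
        linked += references.get(doc_id, [])  # documents that doc_id cites
    for other in linked:
        enriched.update(features.get(other, {}))  # add the linked counts
    return enriched

# Toy data (invented for illustration only).
features = {
    "A": {"amyloid": 2},
    "B": {"amyloid": 1, "tau": 1},
    "C": {"dementia": 3},
}
citations = {"A": ["B"]}   # B cites A
references = {"A": ["C"]}  # A cites C

enriched_a = enrich_features("A", features, citations, references)
```

Switching `use_citations` and `use_references` on or off gives the three enrichment variants the summary compares: citations only, references only, or both. The enriched counts would then serve as training input to whichever classifier is chosen.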