Evaluation of Video Soundtracks using Machine Learning. Addressing the issues of data availability, feature extraction and classification

Author name: Georgios Touros
Title: Evaluation of Video Soundtracks using Machine Learning. Addressing the issues of data availability, feature extraction and classification
Year: 2018-2019
Supervisor: Theodoros Giannakopoulos

Summary

The aim of this thesis is to address the challenges of combining multimodal data to evaluate video soundtracks. Tasks in soundtrack generation, retrieval, or evaluation require data from as many relevant modalities as possible, such as audio, video, and symbolic representations of music. We propose a method for collecting relevant data from all of these modalities, and from them we attempt to describe and extract a comprehensive multimodal feature library. We construct a database by applying our method to a small set of available data from the three modalities. We then implement and tune a classifier on this database of features, with adequate results. The classifier attempts to discriminate between real and fake examples of video soundtracks. Finally, we describe possible improvements to the methods and point to use cases and directions for future work on this and adjacent tasks.