Multimodal summarization of user generated videos from wearable cameras

Author name	Theodoros Psallidas
Title	Multimodal summarization of user generated videos from wearable cameras
Year	2019-2020
Supervisor	Theodoros Giannakopoulos

Summary

The aim of this thesis is to construct a video summarization procedure that distills a video sequence into a more compact yet still informative form. The exponential growth of user-generated content has increased the need for efficient video summarization schemes. However, most existing approaches underestimate the power of aural features and are designed to work mainly on commercial or professional videos. In this work, we present an approach that uses both audio and visual features to create summaries of user-generated videos. Our approach produces dynamic video summaries, i.e., summaries comprising the most "important" parts of the original video, arranged so as to preserve their temporal order. We use supervised knowledge from both modalities to train a binary classifier that learns to recognize the important parts of videos. Moreover, we present a novel user-generated dataset that contains videos from several categories. Every 1-second segment of each video in our dataset has been annotated as important or not by more than three annotators. We evaluate our approach using several classification strategies based on audio, visual, and fused features. Our experimental results illustrate the potential of our approach.
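As a rough illustration of the pipeline described above, the following Python sketch shows three steps that the summary implies but does not specify: combining several annotators' per-second labels into one binary label, fusing audio and visual feature vectors, and turning per-second predictions into a temporally ordered summary. Everything here is an assumption made for illustration: the function names are hypothetical, majority voting (with ties treated as "not important") is only one possible way to aggregate annotations, and simple concatenation is only one possible fusion strategy. The thesis may use different choices, and the classifier itself is left out.

```python
from typing import List, Sequence, Tuple


def majority_vote(annotations: Sequence[Sequence[int]]) -> List[int]:
    """Aggregate per-second 0/1 labels from several annotators.

    annotations[a][t] is annotator a's label for second t.
    Assumption: a second is "important" only with a strict majority.
    """
    n_annotators = len(annotations)
    n_seconds = len(annotations[0])
    labels = []
    for t in range(n_seconds):
        votes = sum(ann[t] for ann in annotations)
        labels.append(1 if 2 * votes > n_annotators else 0)
    return labels


def fuse_features(audio: Sequence[Sequence[float]],
                  visual: Sequence[Sequence[float]]) -> List[List[float]]:
    """Early fusion (assumed): concatenate per-second audio and visual vectors."""
    if len(audio) != len(visual):
        raise ValueError("audio and visual must cover the same number of seconds")
    return [list(a) + list(v) for a, v in zip(audio, visual)]


def summary_intervals(predictions: Sequence[int]) -> List[Tuple[int, int]]:
    """Merge consecutive important seconds into (start, end) intervals.

    The end is exclusive. Intervals come out in temporal order because
    the predictions are scanned from the first second to the last.
    """
    intervals = []
    start = None
    for t, p in enumerate(predictions):
        if p == 1 and start is None:
            start = t
        elif p != 1 and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, len(predictions)))
    return intervals
```

In a full pipeline, the fused vectors and aggregated labels would be used to train a binary classifier, and `summary_intervals` would be applied to that classifier's per-second predictions to decide which parts of the original video to keep.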

© National Centre for Scientific Research "Demokritos" for the Institute of Informatics & Telecommunications, and University of the Peloponnese for the Department of Informatics and Telecommunications. The contents of the "MSc in Data Science" website may be freely reproduced for non-commercial purposes.
