| Author name | Dimitris Sofos |
|---|---|
| Title | An Ensemble Method for Early Time Series Classification |
| Year | 2024-2025 |
| Supervisor | Charilaos Akasiadis |

This work proposes a novel combination of ensemble learning with Early Time Series Classification (ETSC) to improve prediction accuracy and minimize decision time in applications that require fast and accurate results, such as fault detection, medical diagnosis, and financial forecasting. The central idea is to apply the Stacking ensemble technique to four established base learners: two models designed explicitly for ETSC, TEASER and ECEC, and two standard Time Series Classification (TSC) models, MLSTM and XCM. Each model is trained separately to leverage the unique strengths of its algorithm, with the aim of building a robust base layer that delivers the best possible results. The outputs of the base models are then passed to a meta-model for the final prediction. Each base model contributes to extracting informative patterns from partial time series, supporting timely and accurate classification decisions.
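
The stacking idea described above can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the implementation used in this work: the two scikit-learn classifiers are hypothetical stand-ins for the actual base learners (TEASER, ECEC, MLSTM, XCM), the data are synthetic, and the helper names `make_toy_series`, `fit_stack_at_prefix`, and `predict_stack` are invented for the example. It shows base learners trained separately on a prefix of each series, with their class probabilities passed to a meta-model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def make_toy_series(n=200, length=50, seed=0):
    """Synthetic univariate series: class 1 has an upward trend, class 0 is pure noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    trend = np.linspace(0.0, 2.0, length)
    X = rng.normal(size=(n, length)) + y[:, None] * trend
    return X, y


def fit_stack_at_prefix(X, y, prefix_len):
    """Train stand-in base learners on the first `prefix_len` points, then a meta-model."""
    X_prefix = X[:, :prefix_len]
    # Hypothetical stand-ins for the real base learners.
    base_learners = [
        KNeighborsClassifier(n_neighbors=5),
        DecisionTreeClassifier(max_depth=3, random_state=0),
    ]
    # Out-of-fold probabilities keep training labels from leaking into the meta-features.
    meta_features = np.hstack([
        cross_val_predict(m, X_prefix, y, cv=5, method="predict_proba")
        for m in base_learners
    ])
    meta_model = LogisticRegression(max_iter=1000).fit(meta_features, y)
    # Refit each base learner on the whole training prefix for use at prediction time.
    fitted = [m.fit(X_prefix, y) for m in base_learners]
    return fitted, meta_model


def predict_stack(fitted, meta_model, X, prefix_len):
    """Stack base-learner probabilities on the prefix and let the meta-model decide."""
    X_prefix = X[:, :prefix_len]
    meta_features = np.hstack([m.predict_proba(X_prefix) for m in fitted])
    return meta_model.predict(meta_features)


X, y = make_toy_series()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
for prefix_len in (10, 25, 50):
    fitted, meta = fit_stack_at_prefix(X_train, y_train, prefix_len)
    acc = (predict_stack(fitted, meta, X_test, prefix_len) == y_test).mean()
    print(f"prefix={prefix_len:2d}  accuracy={acc:.2f}")
```

Fitting one stack per prefix length is only one possible design. It keeps the example short and makes it easy to compare how much of a series must be observed before the combined prediction becomes reliable.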
TEASER is used for its logistic-regression-based prediction of class probabilities at each snapshot. MLSTM extracts local features and is effective with multivariate time series. ECEC trains multiple classifiers at different time points, and XCM uses parallel 1D and 2D convolutional layers to extract features per variable and across variables over time, while also offering feature attribution. The predictions from all base models at each time point are collected and passed to a single meta-model, implemented either as a Random Forest, which can learn complex combinations of the base outputs, or as a simpler Logistic Regression model. The meta-model is trained at different time points to learn which base model is more reliable at each stage. The proposed method is evaluated experimentally on benchmark datasets from the UEA and UCR archives, as well as on biological and maritime datasets. The evaluation criteria are Accuracy, F1-score, Earliness, and the Harmonic Mean of Earliness and Accuracy. The proposed method outperformed the individual models in several cases, particularly on multivariate, noisy, and class-imbalanced datasets, thus fulfilling the primary objectives of this study.
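
As a rough illustration of the last two evaluation criteria, the sketch below assumes the convention common in the ETSC literature: earliness is the mean fraction of each series observed before a decision (lower is better), and the harmonic mean combines accuracy with `1 - earliness`. The function names and the example numbers are hypothetical, and the exact definitions used in this work may differ.

```python
def earliness(prefix_lengths, full_lengths):
    """Mean fraction of each series observed before a decision (lower means earlier)."""
    ratios = [p / n for p, n in zip(prefix_lengths, full_lengths)]
    return sum(ratios) / len(ratios)


def harmonic_mean(accuracy, earliness_value):
    """Harmonic mean of accuracy and (1 - earliness); both terms are higher-is-better."""
    timeliness = 1.0 - earliness_value
    if accuracy + timeliness == 0:
        return 0.0
    return 2.0 * accuracy * timeliness / (accuracy + timeliness)


# Hypothetical example: three series of length 100, decided after 20, 40, and 60 points.
e = earliness([20, 40, 60], [100, 100, 100])
hm = harmonic_mean(0.8, e)
print(f"earliness={e:.2f}  harmonic_mean={hm:.3f}")
```

Under this convention, a classifier scores well only if it is both accurate and early: a perfectly accurate model that waits for the full series (earliness of 1) gets a harmonic mean of 0.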