Predicting Component Defects for the Shipping Industry with the use of Machine Learning Algorithms

Author name: Anastasios Makris

Iraklis - Angelos Klampanos



The increasingly competitive environment in the shipping industry, combined with advances in data science, computer science and software in general, has made improving operational processes more important than ever. One area on which most shipping companies focus is maintenance. High-quality maintenance and a commitment to following procedures help maritime organizations provide optimal conditions and resources for propelling the ship safely at sea. Maintenance in the shipping industry is not just a periodic set of actions carried out by seafarers, but a constantly developing process through which the necessary work is performed on time. Nowadays, maintenance is planned and carried out based on an analysis of the vessel’s ecosystem and the relevant information. Predictive Maintenance (PM) is performed after monitoring and analysing data gathered from the PMS, the installed sensors and other sources indicative of the health of the vessel. This thesis is concerned with Predictive Maintenance. The aim of the study is to create models that attempt to answer PM-related questions, such as whether a component will become defective within the next year, in which quarter of the year, and how severe the defect will be. In particular, we collected data from the PMS of a relatively large shipping company, processed them and performed the necessary calculations to create a dataset to which Machine Learning algorithms could be fitted so that their performance could be evaluated. Furthermore, as this dataset did not contain sensor data, we used a second dataset, found online, which consisted of historical data as well as information from the sensors installed in the components. Due to the nature of the data and of the problem we tried to solve, the datasets used in this work suffered from severe class imbalance.
Additionally, the components included in the datasets were of different types and of differing importance to the vessels, resulting in a lack of uniformity. For instance, the datasets contained deck components alongside critical main engine components. After applying appropriate strategies and techniques, these issues were resolved and the performance of the models improved. It should be noted that Random Forest and SVM were two algorithms that performed well, in line with what had been reported in previous related work.
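
The abstract does not name the techniques used to address class imbalance. As an illustration only, the sketch below shows two common options: weighting classes inversely to their frequency, and randomly oversampling the minority class. The function names, the toy labels and the 95/5 split are assumptions made for this example and are not taken from the thesis.

```python
from collections import Counter
import random


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - cnt)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: 1 = component defective next year, 0 = not defective.
labels = [0] * 95 + [1] * 5
rows = [[i] for i in range(100)]

weights = balanced_class_weights(labels)
resampled_rows, resampled_labels = random_oversample(rows, labels)
```

The weighting formula is the same heuristic that scikit-learn applies when an estimator such as a Random Forest or an SVM is given `class_weight="balanced"`. Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.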