
6.1 Summary

6.2.1 RQ 1: How can AIS data combined with specific vessel de-

The existing literature could not fully answer this research question, which further motivated the model developed in this thesis. The thesis therefore proposes a method for predicting the future destinations of vessels based on historical AIS data and specific vessel details. Vessel voyages were defined, and trajectories were constructed from historical AIS records. These trajectories were structured as categorical and numerical values by making initial predictions based purely on the spatial trajectories, namely by calculating the Most Similar Trajectory’s Destination (MSTD). The resulting training dataset was extended with additional vessel details such as the vessels’ segments and sub-segments. Thus, any classification-oriented ML model could be trained to predict voyages’ arrival ports.

Chapter 6: Discussion 86

RQ 1a: What prediction methods can be used to predict vessel destinations?

In addition to the method proposed in this thesis, the existing literature showed a few methods capable of predicting destination ports. The only study found that was not limited to specific geographical regions developed a Random Forest (RF)-based trajectory similarity measure, which was used to find the destination port of a traveling vessel’s most similar historical trajectory, similar to the MSTD value used in this thesis. That study also used the frequencies of port visits to normalize the predictions. In the solution proposed in this thesis, this ML-based trajectory similarity method could replace the SSPD method when calculating the MSTD value, both in the training dataset and when making predictions.
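To make the MSTD step more concrete, the following is a minimal sketch of how a most similar trajectory's destination could be looked up with a Symmetrized Segment-Path Distance (SSPD). It is an illustration under stated assumptions, not the thesis' implementation: trajectories are treated as planar (x, y) points with Euclidean distances rather than geographical coordinates, and the names `spd`, `sspd`, `most_similar_destination`, and the port labels are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # planar (x, y); a simplification of lat/lon


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def spd(t1: Sequence[Point], t2: Sequence[Point]) -> float:
    """Segment-path distance: mean distance from the points of t1 to the path t2."""
    if len(t2) == 1:
        segments = [(t2[0], t2[0])]
    else:
        segments = list(zip(t2[:-1], t2[1:]))
    total = sum(
        min(point_segment_distance(p, a, b) for a, b in segments) for p in t1
    )
    return total / len(t1)


def sspd(t1: Sequence[Point], t2: Sequence[Point]) -> float:
    """Symmetrized segment-path distance: average of both directed distances."""
    return 0.5 * (spd(t1, t2) + spd(t2, t1))


def most_similar_destination(
    query: Sequence[Point], history: List[Tuple[Sequence[Point], str]]
) -> str:
    """Return the destination label of the historical trajectory closest to query."""
    _, destination = min(history, key=lambda item: sspd(query, item[0]))
    return destination


# Hypothetical historical voyages: (trajectory, destination port label).
history = [
    ([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], "PORT_A"),
    ([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)], "PORT_B"),
]
query = [(0.0, 0.1), (1.0, 0.1)]  # partial trajectory of a traveling vessel
mstd = most_similar_destination(query, history)
```

Swapping in a learned similarity measure, as discussed above, would only require replacing the `sspd` call inside `most_similar_destination`.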

In terms of ML models, when destination prediction is formulated as a classification problem, the most viable models appear to be tree-based ensemble methods such as Random Forest (RF) or Extreme Gradient Boosting (XGBoost). In contrast, many different models were applied to short-term trajectory prediction in related work. For these predictions, nearest neighbor search-based approaches were common, as were a variety of feedforward neural networks.

RQ 1b: What information can be used to predict vessel destinations?

The related work showed that purely spatial attributes in historical AIS data had been used to predict vessels’ future trajectories or destinations. A few studies used the vessels’ heading and speed as well as their geographical coordinates; however, it was most common to consider only trajectories derived from geographical coordinates. In the thesis’ proposed solution, the vessels’ departure ports and vessel segmentation proved to be highly impactful on destination predictions.

Furthermore, the shipping experts interviewed explained that, in addition to vessel segments and sub-segments, vessels’ current drafts (depth underwater) combined with port restrictions can indicate where vessels will travel. Large vessels are particularly affected, as few ports are capable of receiving and loading them. Weather also has a large impact on vessels’ traveling patterns but usually does not affect the final destination port, as this is decided before the voyage begins. The experts also explained that seasonality may be a factor, as some goods are only exported during particular seasons. In areas affected by ice, seasonality may also affect vessels’ voyage trajectories, as some areas, such as the Northeast Passage, are unnavigable for most vessels during winter months.
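As an illustration of how the experts' observation about drafts and port restrictions might be used, the sketch below filters candidate destination ports by a maximum allowed draft. This is not part of the thesis' proposed method; the `Port` class, the `feasible_ports` function, the port labels, and all numeric values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Port:
    name: str
    max_draft_m: float  # hypothetical port restriction, in metres


def feasible_ports(vessel_draft_m: float, ports: List[Port]) -> List[str]:
    """Return the names of ports whose draft limit admits the vessel."""
    return [p.name for p in ports if vessel_draft_m <= p.max_draft_m]


# Made-up values for illustration; not taken from the thesis or any port register.
ports = [Port("PORT_A", 9.5), Port("PORT_B", 14.0), Port("PORT_C", 20.5)]
candidates = feasible_ports(15.2, ports)  # a deep-draft vessel has few options
```

A filter of this kind could narrow the set of classes before or after a classifier makes its prediction, which matters most for large vessels with few feasible ports.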


RQ 1c: To what extent do methods proposed in existing work vary in scope of applicability?

Time extent. Based on the review of the current literature, most prediction methods did not consider future destination ports, but rather vessels’ future short-term trajectories. The development of such methods was often motivated by security improvement, and the methods were used to detect possible collision scenarios. As collision scenario detection is only relevant over shorter time frames, these prediction methods were limited to time intervals ranging from minutes to a few hours at most.

Geographical extent. As most related studies are motivated by goals such as security improvement, port management, and anomaly detection, they are limited not only in temporal extent but also in geographical extent. For collision detection, the geographical area is not very relevant because the developed methods should be applicable to any given area, port, or region. Anomaly detection studies, in contrast, were usually limited to specific geographical regions in order to reduce the amount of noise or irrelevant data. For example, for detecting illegal or irregular fishing activity, only fishing vessels in a particular area are considered. The few longer-term prediction methods found that did consider logistics were also mostly limited to a single geographical region. Seemingly, this was often because access to global historical AIS data was limited, or because the studies were conducted in collaboration with a specific maritime organization. Only one study was found to consider destination predictions on a global scale, independent of both geographical and time limitations.

Data depth. In terms of the breadth of data considered, most related studies only considered geographical data. Some collision detection studies additionally used navigational attributes of the AIS data such as the vessel’s heading, Course Over Ground (COG), and Speed Over Ground (SOG). The few studies that considered destination port predictions depended on port data; however, they only considered vessels’ spatial trajectories in their predictions and ignored specific vessel details such as types or segments.

The fact that most related work is motivated by safety improvement and collision detection reflects the original intent behind the AIS initiative. AIS was not introduced for economic or commercial purposes, but for safety and navigation. However, in recent years, the commercial shipping industry has begun using AIS for commercial purposes, as it has become a trusted source of information. Thus, it is probable that more studies will focus on AIS for destination prediction and logistics in the future.


RQ 1d: How can the validity of predictions made based on different prediction methods be established?

Many different validation approaches were taken in the existing literature. The most common was some form of k-fold cross-validation with multiple performance metrics such as F1-score, Mean Distance Error (MDE), and accuracy. Distance-based error measurements were mostly used for short-term predictions that required high positional accuracy; however, they were also applied in a few papers that considered destination prediction. In these studies, the distance between the predicted and the actual destination port was measured. This provides further insight into the level of error for incorrect predictions. For example, if a predicted arrival port was wrong but very close to the actual arrival port, the trajectory-based prediction still performed quite well. This is a good candidate for future work, as the evaluation process used in this thesis did not provide much insight into the level of error for incorrect predictions.
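A distance-based evaluation of destination predictions could look roughly like the sketch below, which reports accuracy alongside the mean great-circle distance between predicted and actual ports. It is a sketch under assumptions, not the evaluation used in the thesis or in the cited studies: the haversine formula with a mean Earth radius of 6371 km is one common choice of distance, and the function names, port labels, and coordinates are hypothetical.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def evaluate(
    predicted: List[str], actual: List[str], port_coords: Dict[str, LatLon]
) -> Dict[str, float]:
    """Accuracy plus distance errors between predicted and actual ports."""
    n = len(actual)
    distances = [
        haversine_km(port_coords[p], port_coords[a])
        for p, a in zip(predicted, actual)
    ]
    wrong = [d for d, p, a in zip(distances, predicted, actual) if p != a]
    return {
        "accuracy": sum(p == a for p, a in zip(predicted, actual)) / n,
        "mean_distance_error_km": sum(distances) / n,
        # Average miss distance over incorrect predictions only.
        "mean_error_when_wrong_km": sum(wrong) / len(wrong) if wrong else 0.0,
    }


# Hypothetical ports placed on the equator, one degree of longitude apart.
port_coords = {"PORT_A": (0.0, 0.0), "PORT_B": (0.0, 1.0)}
predicted = ["PORT_A", "PORT_B", "PORT_B"]
actual = ["PORT_A", "PORT_A", "PORT_B"]
report = evaluate(predicted, actual, port_coords)
```

Reporting the error over incorrect predictions separately distinguishes a near miss at a neighboring port from a prediction on the wrong continent, which accuracy alone cannot show.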

6.2.2 RQ 2: What is the impact of vessel segmentation by type, size,

In document Vessel destination forecasting based on historical AIS data (pages 102-105)