A comparison between machine and deep learning models on high stationarity data

Santoro, Domenico; Ciano, Tiziana; Ferrara, Massimiliano

doi:10.1038/s41598-024-70341-6

Advances in sensor, computing, and communication technologies are enabling big data analytics by providing time series data. However, conventional models struggle to identify sequence features and forecast accuracy. This paper investigates time series features and shows that some machine learning algorithms can outperform deep learning models. In particular, the problem analyzed concerned predicting the number of vehicles passing through an Italian tollbooth in 2021. The dataset, composed of 8766 rows and 6 columns relating to additional tollbooths, proved to have high stationarity and was treated through machine learning methods such as support vector machine, random forest, and eXtreme gradient boosting (XGBoost), as well as deep learning through recurrent neural networks with long short-term memory (RNN-LSTM) cells. From the comparison of these models, the prediction through the XGBoost algorithm outperforms competing algorithms, particularly in terms of MAE and MSE. The result highlights how a shallower algorithm than a neural network is, in this case, able to obtain a better adapt