
Article

TRAFFIC VOLUMES PREDICTION USING BIG DATA ANALYTICS METHODS

DOI: 10.7708/ijtte.2021.11(2).01


Volume 11, Issue 2, Pages 184-198

Author(s)


Abstract

The use of advanced traffic data collection systems, together with the development of Big Data technologies for storing and processing large amounts of data, has enabled the application of various non-parametric methods to traffic volume prediction. This research investigated the applicability of supervised machine learning, as a Big Data analytics method, to predicting various traffic volume indicators. The research was conducted through two case studies. In both, predictive models were trained and tested on traffic data generated by selected automatic traffic counters on roads in the Republic of Serbia between 2011 and 2018. Prediction models were trained, tested and applied using the Weka software tool. Basic data preparation was performed using MS Excel macros written in VBA (Visual Basic for Applications). In the first case study, the goal was to predict the total daily traffic volume on selected sections of state roads in the Republic of Serbia. The datasets used to train and test the machine learning models in the first case study were prepared using an MS Access database, and the prediction results were presented using Excel Pivot Charts. In the second case study, one counting point was selected and the hourly vehicle flow was predicted for each direction and in total for both directions. In this second study, dataset preparation and visualization of the results were performed using programs written in Python. In both case studies, different regression prediction models were trained and tested on the prepared datasets using Weka.
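As a rough illustration of the kind of Python data preparation described for the second case study, the sketch below aggregates raw counter records into hourly flows per direction and in total. The record layout, the direction labels "A" and "B", and the sample values are hypothetical; the abstract does not describe the actual counter file format or the authors' scripts.

```python
from collections import defaultdict

def hourly_flow(records):
    """Aggregate (hour, direction, vehicles) records into hourly flows.

    Returns a dict mapping each hour to a dict with one entry per
    direction plus a "total" entry summing all directions.
    """
    flows = defaultdict(lambda: defaultdict(int))
    for hour, direction, vehicles in records:
        flows[hour][direction] += vehicles

    result = {}
    for hour, by_direction in sorted(flows.items()):
        row = dict(by_direction)
        row["total"] = sum(by_direction.values())
        result[hour] = row
    return result

# Hypothetical sample records, invented for illustration only.
records = [
    ("2018-05-01 08:00", "A", 120),
    ("2018-05-01 08:00", "B", 95),
    ("2018-05-01 08:00", "A", 30),
    ("2018-05-01 09:00", "A", 110),
]
flows = hourly_flow(records)
```

Each resulting row could then be exported, for example as CSV or ARFF, for training in a tool such as Weka.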
In the first case study, the best results were obtained by models based on regression decision trees, while in the second study, models based on the Lazy IBk, Random Forest, Random Committee and Random Tree algorithms were among the best. In each case study, the best prediction model was selected by comparing model performance measures such as the correlation coefficient, mean absolute error, and root mean squared error. The model based on the M5P algorithm showed the best performance in the first study, while the Lazy IBk algorithm gave the best results in the second. Using the best predictive models, daily or hourly traffic at the selected counting points was predicted for 2020. Supervised machine learning proved to be an effective method for predicting traffic flow volume.
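The three performance measures named above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how the measures were weighed against each other, so ranking by lowest root mean squared error is an assumption, and the model names and values are invented.

```python
import math

def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared prediction error."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)

def best_model(actual, predictions):
    """Return the name of the model with the lowest RMSE (assumed criterion)."""
    return min(predictions,
               key=lambda name: root_mean_squared_error(actual, predictions[name]))

# Hypothetical traffic volumes and model outputs, for illustration only.
actual = [100, 200, 300, 400]
predictions = {
    "model_a": [110, 190, 310, 390],
    "model_b": [150, 150, 350, 350],
}
winner = best_model(actual, predictions)
```

A higher correlation coefficient and lower error values indicate a better fit; in practice the three measures may disagree, in which case the selection criterion needs to be stated explicitly.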




Acknowledgements:

This work was partially supported by the Ministry of Education, Science and Technological Development of the Republic of Serbia, under project number 036012.

