A Comparison Study between Data Mining Algorithms over Classification Techniques in Squid Dataset

Fartash Haghanikhameneh, Payam Hassany Shariat Panahy, Nasim Khanahmadliravi, Seyed Ahmad Mousavi


Classification is one of the most important supervised learning techniques in data mining. Classification algorithms can be highly useful for interpreting and illustrating bandwidth usage patterns and for predicting the bandwidth required by different groups in distinct time intervals, with the aim of improving efficiency. The dataset used in this study was collected over one year from the access.log file of a Squid proxy server at a computer institute. This study compares several classification algorithms for predicting bandwidth usage patterns across time intervals among different groups of network users. The algorithms, including Decision Tree and Naïve Bayesian, are compared using Orange, a data mining tool. The experimental results showed that the Decision Tree algorithm predicted the required bandwidth within the network accurately and efficiently.
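
As an illustration of the kind of preprocessing such a study needs, the following minimal sketch aggregates bytes transferred per user group and hour of day from lines in Squid's native access.log layout. It is not the authors' procedure: the sample log lines, the function names, and the rule that maps a client address to a group (here, the third octet of the IP address) are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample lines in Squid's native access.log layout:
# timestamp elapsed client code/status bytes method URL user hierarchy/peer type
SAMPLE_LOG = """\
1357034400.123 250 10.0.1.15 TCP_MISS/200 20480 GET http://example.com/a - DIRECT/93.184.216.34 text/html
1357034460.456 120 10.0.1.15 TCP_HIT/200 10240 GET http://example.com/b - NONE/- image/png
1357038000.789 900 10.0.2.7 TCP_MISS/200 51200 GET http://example.org/c - DIRECT/93.184.216.34 video/mp4
"""


def group_by_subnet(client_ip):
    """Assumed grouping rule: the third octet of the client IP names the group."""
    return "group-" + client_ip.split(".")[2]


def bytes_per_group_and_hour(log_text, group_of):
    """Sum transferred bytes for each (group, hour-of-day in UTC) pair."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        timestamp = float(fields[0])  # seconds since the Unix epoch
        client = fields[2]            # client IP address
        size = int(fields[4])         # bytes delivered to the client
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
        totals[(group_of(client), hour)] += size
    return dict(totals)


usage = bytes_per_group_and_hour(SAMPLE_LOG, group_by_subnet)
for (group, hour), total in sorted(usage.items()):
    print(group, hour, total)
```

Each resulting (group, hour, bytes) row could then serve as one instance for a classifier, for example after discretizing the byte totals into usage classes and exporting the table to a data mining tool.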

