
A Study of Normalization Approach on K-Means Clustering Algorithm

U. Dauda, B. M. Ismail

Abstract


K-means clustering is a widely used tool for cluster analysis because of its conceptual simplicity and computational efficiency. However, its performance can degrade on high-dimensional data, where the number of variables is relatively large and many of them may carry no information about the clustering structure. In this paper, we point out that, without data normalization, problems arise in many data mining applications. The effectiveness of normalization for k-means clustering is demonstrated through a variety of numerical experiments using three methods: z-score, min-max, and decimal scaling. Experimental analysis shows that z-score normalization performs best, giving the most accurate results of the three procedures and also reducing the number of iterations.
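The three normalization procedures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions, applied to a single variable, and is not the authors' implementation. The function names, the choice of the population standard deviation for the z-score, and the default target range [0, 1] for min-max are assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_score(values):
    """Subtract the mean and divide by the standard deviation.

    Assumption: the population standard deviation is used; the abstract
    does not say which variant the authors applied.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest non-negative integer
    such that every scaled absolute value is below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, not from the paper
    print(z_score(sample))
    print(min_max(sample))
    print(decimal_scaling(sample))
```

In practice each function would be applied to every variable (column) separately before running k-means, so that no single variable dominates the Euclidean distance simply because of its scale.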

Keywords


k-means clustering, data normalization, z-score, min-max, decimal scaling, infectious disease.


