Grid-based k Estimation Method for Efficient k-Means Clustering 


Vol. 14, No. 10, pp. 796-803, Oct. 2025
https://doi.org/10.3745/TKIPS.2025.14.10.796


  Abstract

As data volumes grow rapidly and data types become more diverse, a wide range of analysis techniques has been studied. Cluster analysis is a representative unsupervised learning technique that partitions a given data set into clusters of similar items. Because the number of clusters k is difficult to estimate before clustering, previous studies have run clustering over a range of k values and then selected the optimal k through score-based evaluation. This approach requires large-scale computation when the amount of data increases significantly. To alleviate this problem, we propose a method that reduces the dimensionality of the data, generates grid-based approximate clusters, and estimates the optimal k for the original data set by evaluating those approximate clusters. We compared the proposed technique with a Mini Batch-based method, a recent optimization technique for clustering analysis, on a number of synthetic and real data sets, and we demonstrate its efficiency in terms of the similarity of silhouette scores and the reduction in execution time.
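To make the grid idea concrete, the following is a minimal sketch of how a grid can compress a data set before any k is evaluated. It is not the authors' algorithm: the abstract does not specify the dimensionality reduction, the cell size, or how approximate clusters are scored. The function name `grid_summarize`, the parameter `cell_size`, and the restriction to 2-D points are all assumptions made for illustration.

```python
from collections import defaultdict


def grid_summarize(points, cell_size):
    """Bin 2-D points into square grid cells (hypothetical helper).

    Returns one (centroid, count) pair per occupied cell, ordered by
    cell index. The pairs act as weighted representatives of the data.
    """
    cells = defaultdict(list)
    for x, y in points:
        # Integer cell index along each axis (assumes cell_size > 0).
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))

    summary = []
    for key in sorted(cells):
        members = cells[key]
        n = len(members)
        cx = sum(p[0] for p in members) / n
        cy = sum(p[1] for p in members) / n
        summary.append(((cx, cy), n))
    return summary


# Toy data: six points that fall into three distinct cells.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1),
          (5.1, 5.2), (5.3, 5.4), (9.9, 0.1)]
summary = grid_summarize(points, cell_size=1.0)
for centroid, count in summary:
    print(centroid, count)
```

Under these assumptions, a score-based sweep over candidate k values could run on the weighted representatives instead of on every original point, which is where a reduction in execution time would come from. How closely the resulting scores track those of the full data set depends on the cell size, which this sketch leaves as a free parameter.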

  Cite this article

[IEEE Style]

C. Moon, M. H. Kim, J. Min, "Grid-based k Estimation Method for Efficient k-Means Clustering," The Transactions of the Korea Information Processing Society, vol. 14, no. 10, pp. 796-803, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.10.796.

[ACM Style]

Cheolhan Moon, Min Hyung Kim, and Jun-Ki Min. 2025. Grid-based k Estimation Method for Efficient k-Means Clustering. The Transactions of the Korea Information Processing Society, 14, 10, (2025), 796-803. DOI: https://doi.org/10.3745/TKIPS.2025.14.10.796.