Barakbah, Ali Ridho and Kiyoki, Yasushi (2010) A Fast Algorithm for K-Means Optimization using Pillar Algorithm. In: The 2nd International Workshop with Mentors on Databases, Web and Information Management for Young Researchers (iDB) 2010, 2-4 Aug 2010, Tokyo, Japan.
PDF (iDB 2010) - Published Version Restricted to Registered users only Available under License Creative Commons Attribution No Derivatives. Download (579Kb) |
Abstract
The ability of K-means to cluster a kind of huge data very quickly often pays to the incorrect clustering results because of its designated initial centroids as cluster centers which are generated randomly. However, the efforts to improve the precision of K-means clustering results may take a highly execution time by optimizing the determination of initial centroids for K-means. This paper presents a fast algorithm for K-means optimization by improving our Pillar algorithm. The Pillar algorithm [19] is inspired by the thought process of determining a set of pillars’ locations in order to make a stable house or building. The algorithm considers the pillars’ placement which should be located as far as possible from each other to withstand against the pressure distribution of a roof, as identical to the number of centroids amongst the data distribution. Hence, the Pillar algorithm designates positions of initial centroids by using the farthest accumulated distance between them. This algorithm is very effective to position the initial centroids for K-means and improve the precision of the clustering results. However, the algorithm takes highly computational time for clustering huge data which often have many outliers, since its complexity O((k+h+1) n) (where k=number of clusters, h=number of outliers, and n=number of data items) to position the initial centroids. In this paper, we reduce the complexity of our previous work Pillar algorithm by excluding the designated initial centroids’ neighbors from next iterations so that the complexity will decrease in line with iterations and speed up the execution time. The performance of our improved Pillar algorithm is examined in the precision rate and computational time with several benchmark datasets as well as image data and compared the existing algorithms. The experimental results show the improved solution using the proposed approach.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science |
Depositing User: | Dr. Ali Ridho Barakbah |
Date Deposited: | 22 Mar 2015 12:17 |
Last Modified: | 22 Mar 2015 12:17 |
URI: | http://repo.pens.ac.id/id/eprint/2737 |
Actions (login required)
View Item |