An efficient framework for obtaining the initial cluster centers

一种获取初始聚类中心的有效框架

阅读:1

Abstract

Clustering is an important tool for data mining since it can determine key patterns without any prior supervisory information. The initial selection of cluster centers plays a key role in the ultimate effect of clustering. More often researchers adopt the random approach for this purpose in an urge to get the centers in no time for speeding up their model. However, by doing this they sacrifice the true essence of subgroup formation and in numerous occasions ends up in achieving malicious clustering. Due to this reason we were inclined towards suggesting a qualitative approach for obtaining the initial cluster centers and also focused on attaining the well-separated clusters. Our initial contributions were an alteration to the classical K-Means algorithm in an attempt to obtain the near-optimal cluster centers. Few fresh approaches were earlier suggested by us namely, far efficient K-means (FEKM), modified center K-means (MCKM) and modified FEKM using Quickhull (MFQ) which resulted in producing the factual centers leading to excellent clusters formation. K-means, which randomly selects the centers, seem to meet its convergence slightly earlier than these methods, which is the latter's only weakness. An incessant study was continued in this regard to minimize the computational efficiency of our methods and we came up with farthest leap center selection (FLCS). All these methods were thoroughly analyzed by considering the clustering effectiveness, correctness, homogeneity, completeness, complexity and their actual execution time of convergence. For this reason performance indices like Dunn's Index, Davies-Bouldin's Index, and silhouette coefficient were used, for correctness Rand measure was used, for homogeneity and completeness V-measure was used. Experimental results on versatile real world datasets, taken from UCI repository, suggested that both FEKM and FLCS obtain well-separated centers while the later converges earlier.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。