A clustering algorithm based on grids for core data and adjacency relationships for edge data



Abstract

Grid-based clustering algorithms have become an important method in data mining because of their efficiency. However, they suffer from parameter sensitivity, poor adaptability to density variations, and misclassification of edge data. Existing research addresses these issues in three main directions: (1) adaptive selection of grid parameters, which struggles to handle variations in cluster density; (2) improved grid division methods (e.g., multi-granularity or dynamic grids), which have limited effectiveness on data with complex shapes; and (3) integration with other clustering techniques, which improves accuracy but increases algorithmic complexity. This paper proposes an improved grid-based clustering algorithm that determines core grids from the uniformity of the data distribution rather than from absolute density, and introduces a clustering strategy for non-core grids based on adjacency relationships. This approach identifies clusters of different densities and reduces dependence on the initial parameters (the density threshold R and the grid partition parameter M). The proposed algorithm integrates grid clustering, partitioning-based clustering, and grid splitting techniques. It employs a regional processing strategy (grid clustering for core regions, and a combination of grid and partitioning techniques for edge regions) to improve accuracy while maintaining efficiency. Experimental results show that the proposed algorithm outperforms six benchmark algorithms on datasets with complex shapes and uneven densities, achieving a balance between efficiency and accuracy.
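The two-stage idea in the abstract (core grids chosen by uniformity, non-core grids resolved through adjacency) can be illustrated with a minimal Python sketch. The abstract does not define the uniformity measure or the edge assignment rule, so everything specific here is an assumption made for illustration: the 2 x 2 quadrant ratio used as a uniformity score, the hypothetical parameters `uniformity_min` and `min_points`, and the nearest-neighbour rule for edge points. It is not the authors' algorithm and omits the grid splitting step entirely.

```python
from collections import defaultdict, deque


def grid_cluster(points, m, uniformity_min=0.5, min_points=4):
    """Label 2-D points in [0, 1] x [0, 1] using an m x m grid; -1 means unassigned."""
    # Bucket point indices by grid cell.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(min(int(x * m), m - 1), min(int(y * m), m - 1))].append(idx)

    def is_core(cell):
        # Assumed uniformity test: split the cell into 2 x 2 quadrants and
        # compare the emptiest quadrant with the fullest one.
        members = cells[cell]
        if len(members) < min_points:
            return False
        counts = [0, 0, 0, 0]
        for idx in members:
            x, y = points[idx]
            qx = min(int((x * m - cell[0]) * 2), 1)
            qy = min(int((y * m - cell[1]) * 2), 1)
            counts[qx * 2 + qy] += 1
        return min(counts) / max(counts) >= uniformity_min

    def neighbours(cell):
        return [(cell[0] + dx, cell[1] + dy)
                for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

    core = {cell for cell in list(cells) if is_core(cell)}

    # Stage 1: flood-fill adjacent core cells into clusters.
    cell_label = {}
    next_label = 0
    for start in sorted(core):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nb in neighbours(current):
                if nb in core and nb not in cell_label:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Stage 2: points in non-core cells take the label of the nearest point
    # found in an adjacent core cell (assumed rule); otherwise they stay -1.
    labels = [-1] * len(points)
    for cell, members in cells.items():
        if cell in cell_label:
            for idx in members:
                labels[idx] = cell_label[cell]
            continue
        candidates = [(j, cell_label[nb])
                      for nb in neighbours(cell) if nb in core
                      for j in cells[nb]]
        if not candidates:
            continue
        for idx in members:
            px, py = points[idx]
            _, label = min(
                candidates,
                key=lambda c: (points[c[0]][0] - px) ** 2 + (points[c[0]][1] - py) ** 2,
            )
            labels[idx] = label
    return labels
```

Because the core test compares quadrant counts within a cell rather than the cell's total count against a global threshold, a sparse but evenly filled cell and a dense but evenly filled cell can both qualify as core, which is the property the abstract attributes to a uniformity-based criterion.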
