Extending the Bicriterion Approach for Anticlustering: Exact and Hybrid Approaches

Abstract

Numerous applications in psychological research require that a data set be partitioned via the inverse of a clustering criterion. This anticlustering seeks high similarity between groups (maximum diversity) or high pairwise dissimilarity within groups (maximum dispersion). Brusco et al. (2020) proposed a bicriterion heuristic (BILS) that simultaneously seeks maximum diversity and maximum dispersion, introducing the bicriterion approach for anticlustering. We investigate whether the bicriterion approach can be improved using exact algorithms that guarantee globally optimal criterion values. Despite the theoretical computational intractability of anticlustering, we present a new exact algorithm for maximum dispersion that scales to quite large data sets ( [Image: see text] ). However, a fully exact bicriterion approach was feasible only for small data sets (about [Image: see text] ). We therefore developed hybrid approaches that maintain optimal dispersion but use heuristics to maximize diversity on top of it. In a simulation study and an example application, we compared several hybrid approaches. An adaptation of BILS that initiates each iteration with a partition having optimal dispersion (BILS-Hybrid-All) performed best across a variety of data inputs. All of the methods developed here, as well as the original BILS algorithm, are available via the free and open-source R package anticlust.
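To make the two criteria concrete, the following minimal sketch scores a partition on both. It is not taken from the paper or from the anticlust package. It assumes the definitions commonly used in the anticlustering literature (diversity as the sum of pairwise dissimilarities within groups, dispersion as the smallest pairwise dissimilarity within any group) and assumes Euclidean distance as the dissimilarity. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def within_group_pairs(labels):
    """Yield index pairs (i, j), i < j, whose items share a group label."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            yield i, j


def diversity(points, labels):
    """Assumed definition: sum of within-group pairwise Euclidean distances."""
    return sum(dist(points[i], points[j]) for i, j in within_group_pairs(labels))


def dispersion(points, labels):
    """Assumed definition: minimum within-group pairwise Euclidean distance."""
    return min(dist(points[i], points[j]) for i, j in within_group_pairs(labels))


# Hypothetical toy data: four items, two groups of two.
points = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 4.0)]
labels_a = [0, 0, 1, 1]  # groups the two nearby items together
labels_b = [0, 1, 0, 1]  # separates the two nearby items

for name, labels in (("a", labels_a), ("b", labels_b)):
    print(name, diversity(points, labels), dispersion(points, labels))
```

In this toy case partition `labels_b` scores higher on both criteria. In general the two criteria can conflict, which is the situation a bicriterion approach is meant to address.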
