BitBIRCH Clustering Refinement Strategies

Abstract

Chemical libraries are not only growing larger, they are doing so at an accelerating pace. Keeping up with this explosion in chemical data demands more than hardware upgrades: we also need dramatically more efficient algorithms. We have been working in this direction by introducing the iSIM framework, which uses n-ary similarity to speed up the processing of very large sets. Recently, we showed how to use this technique to cluster billions of molecules with unprecedented efficiency through the BitBIRCH algorithm. In this Application Note we present a package fully dedicated to expanding the BitBIRCH method, including multiple options that give the user appreciable control over the tree structure while dramatically improving the quality of the final partitions. Remarkably, this is achieved without compromising the efficiency of the original method. We also present new post-processing tools that help dissect the clustering information, as well as ample examples showcasing the new functionalities. BitBIRCH is publicly available at: https://github.com/mqcomplab/bitbirch.
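
To illustrate why an n-ary similarity can be fast, the sketch below computes a set-level Tanimoto value for binary fingerprints from per-bit column sums, in time linear in the number of molecules, and checks it against a quadratic pairwise loop. It is a minimal NumPy illustration under stated assumptions, not the API of the bitbirch package: the function names `isim_tanimoto` and `pairwise_ratio` are hypothetical, and the quantity shown is the ratio of summed pairwise intersections to summed pairwise unions, which may differ from the exact definition used in the iSIM papers.

```python
import numpy as np


def isim_tanimoto(fps):
    """Set-level Tanimoto from column sums (hypothetical helper, O(N*M))."""
    fps = np.asarray(fps, dtype=np.int64)
    n = fps.shape[0]
    k = fps.sum(axis=0)                  # molecules with each bit on
    common = (k * (k - 1) // 2).sum()    # pairs sharing an on bit, over all bits
    mismatch = (k * (n - k)).sum()       # pairs differing on a bit, over all bits
    total = common + mismatch
    return common / total if total else 0.0


def pairwise_ratio(fps):
    """Same quantity by brute force over all pairs (O(N^2*M) reference)."""
    fps = np.asarray(fps, dtype=bool)
    inter = union = 0
    for i in range(len(fps)):
        for j in range(i + 1, len(fps)):
            inter += int(np.sum(fps[i] & fps[j]))
            union += int(np.sum(fps[i] | fps[j]))
    return inter / union if union else 0.0


if __name__ == "__main__":
    toy = [[1, 1, 0, 0],
           [1, 0, 1, 0],
           [1, 1, 1, 0]]
    print(isim_tanimoto(toy), pairwise_ratio(toy))
```

The column-sum version never enumerates pairs, so its cost grows linearly with the number of fingerprints, while the reference loop grows quadratically. Both return the same value because each bit contributes to the pair totals independently of the others.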
