Clustering-based progressive alignment with fuzzy logic (CPA-FL)

Abstract

Multiple sequence alignment (MSA) is a fundamental tool for identifying conserved regions and inferring molecular structure, function, and evolutionary relationships. Despite decades of progress, aligning large and evolutionarily diverse sequence sets remains computationally challenging and prone to error propagation in order-dependent pipelines. Here, we present a comprehensive performance evaluation of CPA-FL (Clustering-based Progressive Alignment with Fuzzy Logic), a flexible MSA framework designed to improve robustness through graph-based clustering and fuzzy membership refinement. CPA-FL was benchmarked against widely used alignment tools across large protein families and curated reference datasets. Two large-scale protein families, HEN1 (438 sequences) and HST (477 sequences), were used to assess alignment quality under multiple clustering and thresholding strategies. Results show that moderate, well-defined clustering combined with progressive profile HMM merging yields the highest SP per aligned column and BLOSUM62-weighted SP scores, indicating improved local alignment accuracy and preservation of evolutionary signal. In contrast, overly aggressive clustering under permissive threshold settings led to fragmentation and reduced biological coherence. Viterbi-based profile HMM merging produced the most compact alignments, reflecting efficient gap handling, while progressive profile HMM merging achieved enhanced local accuracy through iterative profile refinement. Comparative benchmarking against Clustal Omega, MUSCLE, Kalign, MAFFT, and T-Coffee demonstrated that CPA-FL configurations achieve competitive or superior performance, particularly in conserved regions. Statistical evaluation using Friedman non-parametric tests on BAliBASE 3.0 reference datasets confirmed significant performance differences across methods (P < 0.00001).
Together, these results establish CPA-FL as a scalable and biologically meaningful framework for large-scale MSA, offering explicit control over clustering granularity while mitigating the brittleness of traditional progressive alignment approaches.
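
The abstract reports "SP per aligned column" and a BLOSUM62-weighted SP score without defining them. The following is a minimal Python sketch of one common reading of the sum-of-pairs (SP) metric, not the authors' implementation. The function name `sp_score_per_column`, the choice to skip any pair that contains a gap, and the default match/mismatch scoring are all assumptions made for illustration. A substitution-matrix-weighted variant can be obtained by passing a different `pair_score` callable, for example a lookup into a BLOSUM62 table loaded from a library of your choice.

```python
from itertools import combinations


def identity_score(a, b):
    """Hypothetical default scoring: 1 for identical residues, 0 otherwise."""
    return 1 if a == b else 0


def sp_score_per_column(alignment, pair_score=identity_score, gap_char="-"):
    """Return the sum-of-pairs score divided by the number of columns.

    alignment  -- list of equal-length aligned sequences (strings)
    pair_score -- callable (a, b) -> number; swap in a substitution-matrix
                  lookup to obtain a weighted SP score
    gap_char   -- symbol used for gaps; pairs containing a gap are skipped
                  (an assumption of this sketch, other conventions penalize them)
    """
    if not alignment:
        raise ValueError("alignment is empty")
    n_columns = len(alignment[0])
    if n_columns == 0:
        raise ValueError("alignment has no columns")
    if any(len(row) != n_columns for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    total = 0
    # zip(*alignment) iterates over columns of the alignment.
    for column in zip(*alignment):
        # Score every unordered pair of residues within the column.
        for a, b in combinations(column, 2):
            if a == gap_char or b == gap_char:
                continue
            total += pair_score(a, b)
    return total / n_columns


if __name__ == "__main__":
    toy = ["ACG-", "ACGT", "A-GT"]
    print(sp_score_per_column(toy))
```

Normalizing by the number of columns makes alignments of different lengths comparable, which matters when merging strategies produce alignments of differing compactness, as the abstract describes for Viterbi-based versus progressive profile HMM merging.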
