Fast sequence alignment for centromeres with RaMA

Abstract

The release of the first draft of the human pangenome has revolutionized genomic research by enabling access to complex regions such as centromeres, which are composed of extra-long tandem repeats (ETRs). However, current methods cannot produce sequence alignments that effectively capture genetic events within ETRs, so improved alignment tools are needed. Inspired by UniAligner, we developed a rare match aligner (RaMA), which uses rare matches as anchors and a two-piece affine gap cost to generate a complete pairwise alignment that better captures genetic evolution. RaMA also employs parallel computing and the wavefront algorithm to accelerate anchor discovery and sequence alignment, achieving up to 13.66 times faster processing while using only 11% of UniAligner's memory. Downstream analysis of simulated data and of the CHM13 and CHM1 higher-order repeat (HOR) arrays demonstrates that RaMA produces more accurate alignments that capture true HOR structures. RaMA also introduces two methods for defining reliable alignment regions, which further improve the accuracy of centromeric alignment statistics.
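As a rough illustration of the two-piece affine gap cost mentioned in the abstract, the sketch below scores a gap as the cheaper of two affine functions: one with a low opening penalty and steep extension, and one with a high opening penalty and shallow extension. The function name and all four penalty values are assumptions chosen for illustration; they are not taken from RaMA.

```python
def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    The penalty values are illustrative assumptions, not RaMA's settings.
    The first piece (open1, ext1) is cheaper for short gaps; the second
    piece (open2, ext2) is cheaper for long gaps.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0  # no gap, no penalty
    short_piece = open1 + ext1 * length
    long_piece = open2 + ext2 * length
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for k in (1, 20, 100):
        print(k, two_piece_gap_cost(k))
```

With these assumed values the two pieces cross at a gap length of 20. Beyond that point a long gap is charged far less than a single affine piece would charge, which is why this kind of model is a natural fit for the large insertions and deletions seen in tandem repeat arrays.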
