MOTIVATION: The Full-text index in Minute space (FM-index) is a memory-efficient data structure widely used in bioinformatics for solving the fundamental pattern-matching task of searching for short patterns within a long reference. Because query patterns are typically short, the k-ordered concept has been proposed for FM-indexes. However, few state-of-the-art construction algorithms fully exploit this idea to achieve significant speedups in the pan-genome era. RESULTS: We introduce the k-ordered Induced Suffix Sorting (kISS) for efficient construction and utilization of k-ordered FM-indexes. We present an algorithmic workflow for building k-ordered suffix arrays, incorporating two novel strategies to improve time and memory efficiency. We also demonstrate that k-ordered FM-indexes are compatible with the locate operations in FMtree. Experiments show that kISS can improve construction time, and that the generated k-ordered suffix array can be applied to FMtree without any additional cost in computation or memory usage. AVAILABILITY: https://github.com/jhhung/kISS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
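To make the k-ordered idea in the abstract concrete, here is a minimal Python sketch, assuming the usual reading of "k-ordered": suffixes are sorted by their first k characters only, and suffixes that share a k-character prefix may appear in any order. This is a naive illustration, not the kISS algorithm; the function names `k_ordered_suffix_array` and `locate` are hypothetical and do not come from the kISS or FMtree code.

```python
from bisect import bisect_left, bisect_right


def k_ordered_suffix_array(text, k):
    """Return suffix start positions sorted by their first k characters only.

    sorted() is stable, so suffixes sharing the same k-prefix stay in
    position order; a k-ordered array does not require any particular
    order among such ties.
    """
    return sorted(range(len(text)), key=lambda i: text[i:i + k])


def locate(text, sa, k, pattern):
    """Find all occurrences of a pattern of length <= k using the k-ordered array."""
    m = len(pattern)
    if m > k:
        raise ValueError("pattern is longer than k; a k-ordered array cannot answer it")
    # Truncating every suffix to m <= k characters keeps the list sorted,
    # so binary search is valid. Building `keys` costs O(n) per query and is
    # done here only for clarity.
    keys = [text[i:i + m] for i in sa]
    lo = bisect_left(keys, pattern)
    hi = bisect_right(keys, pattern)
    return sorted(sa[lo:hi])


text = "banana"
sa = k_ordered_suffix_array(text, 2)
print(sa)
print(locate(text, sa, 2, "an"))
```

For `"banana"` with k = 2, the suffixes starting at positions 1 and 3 both begin with `an`, so they are left in position order rather than being fully sorted against each other. Queries no longer than k still resolve correctly, which is the property that makes k-ordering attractive when patterns are short.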
Efficient Construction and Utilization of k-Ordered FM-indexes with kISS for Ultra-Fast Read Mapping in Large Genomes.
Authors: Yang Zheng-Dao, Kuo Hsuan-Yu, Hsieh Po-Wei, Hung Jui-Hung
| Journal: | Bioinformatics | Impact factor: | 5.400 |
| Year: | 2024 | Volume/pages: | 2024 Jun 19; 40(7):btae409 |
| doi: | 10.1093/bioinformatics/btae409 | ||
