FFC: a scalable FASTA compressor

Abstract

SUMMARY: FASTA is a widely used text-based format for storing nucleotide and protein sequences. Existing FASTA compressors usually focus on (slightly) improving the compression ratio rather than on practical performance. We present FFC, a scalable FASTA compressor that achieves average compression speeds 4.7× and 11.4× higher than two high-performance compressors, zstd and NAF, respectively, across a benchmark set of seven single genomes. It also delivers average decompression speeds 3.5× and 2.7× higher than zstd and NAF, respectively. Although pzstd, a chunk-based zstd variant with parallel decompression, almost matches FFC's speed, its compression ratio is on average 23% worse than FFC's. The experiments were run on a 14-core workstation with a RAM disk (to reduce the impact of I/O).

AVAILABILITY AND IMPLEMENTATION: FFC is freely available at github.com/kowallus/ffc and as a Zenodo repository at 10.5281/zenodo.18892353; the datasets used are available at 10.5281/zenodo.18873744.
