Abstract
SUMMARY: Minimizer digestion is an increasingly common component of bioinformatics tools, including tools for de Bruijn graph assembly and sequence classification. We describe Digest, a new open-source tool and library for efficient digestion of genomic sequences. It can produce digests based on the related ideas of minimizers, modimizers, or syncmers. Digest uses efficient data structures, scales well to many threads, and produces digests with the expected spacings between digested elements.

AVAILABILITY AND IMPLEMENTATION: Digest is implemented in C++17 with a Python API and is available as open source at https://github.com/VeryAmazed/digest. The Python library is available on Bioconda. Rust bindings are available as a public crate at https://crates.io/crates/digest-rs.
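
For readers unfamiliar with the three digestion schemes named above, the following is a minimal illustrative sketch in plain Python. It is not the Digest API and makes no claim about how Digest is implemented: the function names, the use of CRC32 as a stand-in hash, and the closed-syncmer variant shown (a k-mer is kept when its smallest s-mer sits at its first or last position) are all assumptions chosen for clarity.

```python
import zlib


def kmer_hash(kmer: str) -> int:
    # Deterministic stand-in hash (CRC32). An assumption for illustration only.
    return zlib.crc32(kmer.encode("ascii"))


def kmers(seq: str, k: int) -> list:
    # All overlapping substrings of length k, in order of position.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def minimizers(seq: str, k: int, w: int) -> list:
    # Positions of the minimum-hash k-mer in each window of w consecutive k-mers.
    hashes = [kmer_hash(x) for x in kmers(seq, k)]
    picked = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        picked.add(start + window.index(min(window)))  # leftmost minimum on ties
    return sorted(picked)


def modimizers(seq: str, k: int, mod: int) -> list:
    # Positions of k-mers whose hash is divisible by mod.
    return [i for i, x in enumerate(kmers(seq, k)) if kmer_hash(x) % mod == 0]


def closed_syncmers(seq: str, k: int, s: int) -> list:
    # Positions of k-mers whose smallest s-mer is at the first or last offset.
    if not 0 < s <= k:
        raise ValueError("s must satisfy 0 < s <= k")
    picked = []
    for i, x in enumerate(kmers(seq, k)):
        sub = [kmer_hash(y) for y in kmers(x, s)]
        lowest = min(sub)
        if sub[0] == lowest or sub[-1] == lowest:
            picked.append(i)
    return picked
```

In this sketch, minimizers guarantee at least one selected k-mer in every window of w consecutive k-mers, whereas modimizers and syncmers decide on each k-mer independently of its neighbours, which is why the spacing between selected elements differs between the schemes.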