tidk: a toolkit to rapidly identify telomeric repeats from genomic datasets



Abstract

SUMMARY: "tidk" (short for telomere identification toolkit) uses a simple, fast algorithm to scan long DNA reads for runs of short, tandemly repeated DNA and aggregates those runs by their canonical DNA string representation. The aggregated repeats are telomeric repeat candidates. The algorithm is shown to be accurate in genomes for which the telomeric repeat unit is known, and it is tested across a wide variety of newly assembled genomes to uncover new telomeric repeat units. Tools are provided to identify telomeric repeats de novo, to scan genomes for known telomeric repeats, and to visualize telomeric repeats on the assembly. "tidk" is implemented in Rust and is available as a command line tool, which can be compiled using the Rust toolchain or downloaded as a binary from bioconda.

AVAILABILITY AND IMPLEMENTATION: The "tidk" Rust crate is freely available under the MIT license (https://crates.io/crates/tidk), and the source code is available at https://github.com/tolkit/telomeric-identifier.
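The scan-and-aggregate idea described in the summary can be illustrated with a minimal Rust sketch. This is not the tidk source code: the function names (`reverse_complement`, `min_rotation`, `canonical`, `count_tandem_repeats`), the fixed unit length `k`, and the `min_copies` threshold are all assumptions made for illustration. Here "canonical" is taken to mean the lexicographically smallest rotation of a repeat unit across both strands, which is one plausible reading of "canonical DNA string representation".

```rust
use std::collections::HashMap;

/// Reverse complement of an uppercase DNA sequence (non-ACGT bytes become N).
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            _ => b'N',
        })
        .collect()
}

/// Lexicographically smallest rotation of a repeat unit.
fn min_rotation(unit: &[u8]) -> Vec<u8> {
    (0..unit.len())
        .map(|i| [&unit[i..], &unit[..i]].concat())
        .min()
        .unwrap_or_default()
}

/// Assumed canonical form: the smallest rotation over both strands, so that
/// TTAGGG, GGGTTA and CCCTAA all collapse to the same key.
fn canonical(unit: &[u8]) -> String {
    let fwd = min_rotation(unit);
    let rev = min_rotation(&reverse_complement(unit));
    String::from_utf8_lossy(&fwd.min(rev)).into_owned()
}

/// Scan `seq` for runs of at least `min_copies` back-to-back copies of a
/// k-mer and sum the copy counts per canonical unit.
fn count_tandem_repeats(seq: &[u8], k: usize, min_copies: usize) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    if k == 0 {
        return counts;
    }
    let mut i = 0;
    while i + k <= seq.len() {
        let unit = &seq[i..i + k];
        // Extend the run while the next k bases repeat the unit exactly.
        let mut copies = 1;
        while i + (copies + 1) * k <= seq.len()
            && &seq[i + copies * k..i + (copies + 1) * k] == unit
        {
            copies += 1;
        }
        if copies >= min_copies {
            *counts.entry(canonical(unit)).or_insert(0) += copies;
            i += copies * k; // skip past the whole run
        } else {
            i += 1;
        }
    }
    counts
}
```

Calling `count_tandem_repeats(read, 6, 3)` on each read and merging the resulting maps would rank candidate units by total copy number. The sketch deliberately ignores details a real tool has to handle, such as trying several unit lengths, filtering homopolymers and units that are themselves repeats of a shorter unit, and tolerating sequencing errors within a run.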
