Seqwin: Ultrafast identification of signature sequences in microbial genomes

Abstract

MOTIVATION: Polymerase chain reaction (PCR) enables rapid, cost-effective diagnostics, but it requires prior identification of genomic regions that allow sensitive and specific detection of target microbial groups, herein referred to as microbial signature sequences. We introduce Seqwin, an open-source framework designed to automate microbial genome signature discovery. Tens of thousands of microbial genomes are now available, a scale that limits the applicability of existing manual and automated approaches for identifying signatures. Modern approaches capable of leveraging all available microbial genomes will ensure sensitive and accurate DNA signature identification and enable robust pathogen detection for clinical, environmental, and public health applications.

RESULTS: Seqwin builds weighted pan-genome minimizer graphs and uses a traversal algorithm to identify signature sequences that occur frequently in target genomes but remain rare in non-targets. Unlike earlier tools that depend on strict presence or absence of sequences, Seqwin accommodates natural sequence variation and scales to very large genome collections. When applied to genomes from C. difficile, M. tuberculosis, and S. enterica, Seqwin recovered more high-quality signatures than alternative methods, with a lower computational burden. Seqwin analysis of nearly 15,000 S. enterica genomes yielded over 200 candidate signatures in less than 10 minutes. Seqwin provides an open-source solution to the long-standing need for scalable microbial signature discovery and diagnostic assay design.

AVAILABILITY: Seqwin is freely available for academic use (https://github.com/treangenlab/Seqwin) and can be installed via Bioconda.
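
The core idea in RESULTS, favoring sequences that are frequent in target genomes and rare in non-targets rather than requiring strict presence or absence, can be illustrated with a minimal Python sketch. This is not Seqwin's implementation: it omits the weighted pan-genome graph and the traversal step, uses lexicographic rather than hashed minimizers, and ignores reverse complements. The function names and the thresholds `min_target` and `max_non_target` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def minimizers(seq, k=5, w=4):
    """Return the set of (w, k)-minimizers of seq.

    A minimizer here is the lexicographically smallest k-mer in each
    window of w consecutive k-mers (a simplification; real tools
    usually order k-mers by a hash and use canonical k-mers).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()
    if len(kmers) < w:
        return {min(kmers)}
    return {min(kmers[i:i + w]) for i in range(len(kmers) - w + 1)}


def candidate_minimizers(targets, non_targets, k=5, w=4,
                         min_target=0.8, max_non_target=0.2):
    """Score minimizers by how often they occur in each genome group.

    Keeps minimizers found in at least `min_target` of the target
    genomes and at most `max_non_target` of the non-target genomes
    (illustrative thresholds, not Seqwin parameters). The returned
    score is target frequency minus non-target frequency.
    """
    target_counts = Counter()
    for genome in targets:
        target_counts.update(minimizers(genome, k, w))
    non_target_counts = Counter()
    for genome in non_targets:
        non_target_counts.update(minimizers(genome, k, w))

    selected = {}
    for m, count in target_counts.items():
        target_freq = count / len(targets)
        non_target_freq = (non_target_counts[m] / len(non_targets)
                           if non_targets else 0.0)
        if target_freq >= min_target and non_target_freq <= max_non_target:
            selected[m] = target_freq - non_target_freq
    return selected
```

Because selection is based on frequency thresholds, a minimizer missing from a few target genomes can still be kept, which mirrors the tolerance to natural sequence variation described above. Assembling the selected minimizers into contiguous signature regions is the part a graph traversal would handle and is not shown here.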
