insilicoSV: a flexible grammar-based framework for structural variant simulation and placement

Abstract

SUMMARY: Structural variants (SVs) are key drivers of genetic variation and disease in the genome. Their discovery remains challenging, however, in large part due to the scarcity of validated SV callsets and comprehensive benchmarks, which are essential for method development and evaluation. The growing number of data-driven learning-based approaches for SV discovery, in particular, requires large, diverse, and well-balanced training datasets to achieve reliable performance. To address this need, SV simulation has served as a key tool for assessing method performance and training SV models. However, existing SV simulators only support a fixed and limited set of SV classes and do not provide fine-grained control over the placement of SVs within specific contexts of the genome. Here we present insilicoSV, a versatile framework for SV simulation, which models SVs using a simple and flexible grammar, allowing users to easily define standard and custom arbitrary genome rearrangements, as well as encode genome placement constraints. This design allows insilicoSV to naturally support new and bespoke SV types, such as the complex rearrangements of cancer genomes. In addition to grammar-based modeling, insilicoSV provides built-in support for 26 predefined SV types, placement of user-provided SVs, small variant simulation, streamlined workflows for the simulation of genome evolution and genome mixtures, read simulation, alignment, and visualization. These features enable the creation of comprehensive genomic datasets for a variety of downstream applications, such as in-depth benchmarking of alignment and variant calling methods, as well as training of data-driven learning-based approaches for SV detection.

AVAILABILITY AND IMPLEMENTATION: insilicoSV is available under the MIT license at https://github.com/PopicLab/insilicoSV and https://doi.org/10.5281/zenodo.17402009.
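
To make the idea of grammar-based SV modeling concrete, the following is a minimal sketch of how a rearrangement could be written as a rewrite rule over named reference segments. The rule notation (`SOURCE -> TARGET`, with lowercase letters marking inverted segments) and the names `apply_rule` and `reverse_complement` are assumptions made for illustration only; they are not taken from the abstract and should not be read as insilicoSV's actual grammar or API.

```python
from typing import Mapping

# Toy illustration only: the rule notation and function names below are
# hypothetical and are not insilicoSV's actual grammar or API.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def apply_rule(rule: str, segments: Mapping[str, str]) -> str:
    """Apply a rule of the form 'SOURCE -> TARGET' to named segments.

    An uppercase letter in TARGET copies the segment unchanged, a lowercase
    letter inserts its reverse complement, and a source letter that does not
    appear in TARGET is deleted.
    """
    source, _, target = (part.strip() for part in rule.partition("->"))
    missing = [s for s in source if s not in segments]
    if missing:
        raise ValueError(f"no sequence given for segments: {missing}")
    out = []
    for symbol in target:
        key = symbol.upper()
        if key not in source:
            raise ValueError(f"target symbol {symbol!r} is not in the source")
        seq = segments[key]
        out.append(seq if symbol.isupper() else reverse_complement(seq))
    return "".join(out)


if __name__ == "__main__":
    segments = {"A": "AAC", "B": "GGT"}
    print(apply_rule("AB -> A", segments))    # deletion of B
    print(apply_rule("AB -> ABB", segments))  # tandem duplication of B
    print(apply_rule("AB -> Ab", segments))   # inversion of B
    print(apply_rule("AB -> bA", segments))   # custom: inverted B moved before A
```

In this toy notation, simple events and custom multi-step rearrangements are written the same way, which is the property a grammar-based description is meant to provide. Placement constraints are outside the scope of the sketch.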
