Read-level genotyping of short tandem repeats using long reads and single-nucleotide variation with STRkit

Abstract

Variation in short tandem repeats (STRs) is implicated in Mendelian disease and complex traits but can be difficult to resolve with short-read genome sequencing. We present STRkit, a software package for genotyping STRs using long-read sequencing (LRS) that uses proximate single-nucleotide variants to improve genotyping accuracy without a priori haplotype information. We show that STRkit has unique strengths versus other methods: it can use data from both major LRS technologies (Pacific Biosciences HiFi [PacBio] and Oxford Nanopore Technologies [ONT]) to output both allele- and read-level copy number and sequence; it performs best in benchmarking, with F1 scores of 0.9631 and 0.9544 with PacBio and ONT data, respectively; it achieves higher rates of Mendelian consistency than other genotyping tools; and it is open-source software. STRkit's features open up new possibilities for association testing, assessing patterns of STR inheritance, and better understanding the functional effects of these notable repeat elements.
