DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools.

Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation and the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule sequencing, as well as coprocessing single-molecule genetic and epigenetic architectures, is limited by computational demands and a lack of supporting tools. Here, we introduce fibertools, a state-of-the-art toolkit that features a semisupervised convolutional neural network for fast and accurate identification of m6A-marked bases using Pacific Biosciences (PacBio) single-molecule long-read sequencing, as well as the coprocessing of long-read genetic and epigenetic data produced using either the PacBio or Oxford Nanopore Technologies (ONT) sequencing platforms. We demonstrate accurate DNA-m6A identification (>90% precision and recall) along >20 kb long DNA molecules with an ∼1000-fold improvement in speed. In addition, we demonstrate that fibertools can readily integrate genetic and epigenetic data at single-molecule resolution, including the seamless conversion between molecular and reference coordinate systems, allowing for accurate genetic and epigenetic analyses of long-read data within structurally and somatically variable genomic regions.
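
To make the idea of converting between molecular and reference coordinate systems concrete, the following is a minimal sketch in Python. It is not the fibertools implementation. It assumes a read aligned with a standard SAM CIGAR string, 0-based positions given in the orientation of the sequence as stored in the alignment record, and no hard-clipped bases counted in the molecular coordinates. The function name `lift_to_reference` and its parameters are hypothetical and chosen only for illustration.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def lift_to_reference(cigar, ref_start, query_positions):
    """Map 0-based molecular (read) positions to 0-based reference positions.

    Positions that fall in insertions or soft clips have no reference
    coordinate and are reported as None. Hypothetical helper for illustration.
    """
    # Collect aligned blocks as (query_start, reference_start, length).
    blocks = []
    q, r = 0, ref_start
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes both read and reference
            blocks.append((q, r, n))
            q += n
            r += n
        elif op in "IS":     # consumes read only
            q += n
        elif op in "DN":     # consumes reference only
            r += n
        # H and P consume neither read nor reference

    lifted = []
    for pos in query_positions:
        hit = None
        for q0, r0, n in blocks:
            if q0 <= pos < q0 + n:
                hit = r0 + (pos - q0)
                break
        lifted.append(hit)
    return lifted


# Example: m6A calls at these read positions, read aligned at reference 100.
m6a_read_positions = [0, 2, 4, 5, 7, 10, 11, 13]
m6a_ref_positions = lift_to_reference("2S3M2I4M1D3M", 100, m6a_read_positions)
```

In this example the positions inside the soft clip (0) and the insertion (5) have no reference coordinate, which illustrates why analyses in structurally variable regions benefit from keeping the molecular coordinates alongside the lifted ones.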
