Nanopore guided annotation of transcriptome architectures

Authors: Jonathan S Abebe, Yasmine Alwie#, Erik Fuhrmann#, Jonas Leins, Julia Mai, Ruth Verstraten, Sabrina Schreiner, Angus C Wilson, Daniel P Depledge

Abstract

Nanopore direct RNA sequencing (DRS) enables the capture and full-length sequencing of native RNAs, without recoding or amplification bias. Resulting data sets may be interrogated to define the identity and location of chemically modified ribonucleotides, as well as the length of poly(A) tails, on individual RNA molecules. The success of these analyses is highly dependent on the provision of high-resolution transcriptome annotations in combination with workflows that minimize misalignments and other analysis artifacts. Existing software solutions for generating high-resolution transcriptome annotations are poorly suited to the small, gene-dense genomes of viruses due to the challenge of identifying distinct transcript isoforms where alternative splicing and overlapping RNAs are prevalent. To resolve this, we identified key characteristics of DRS data sets that inform resulting read alignments and developed the nanopore guided annotation of transcriptome architectures (NAGATA) software package (https://github.com/DepledgeLab/NAGATA). We demonstrate, using a combination of synthetic and original DRS data sets derived from adenoviruses, herpesviruses, coronaviruses, and human cells, that NAGATA outperforms existing transcriptome annotation software and yields a consistently high level of precision and recall when reconstructing both gene-sparse and gene-dense transcriptomes. Finally, we apply NAGATA to generate the first high-resolution transcriptome annotation of the neglected pathogen human adenovirus type F41 (HAdV-41), for which we identify 77 distinct transcripts encoding at least 23 different proteins.

Importance: The transcriptome of an organism denotes the full repertoire of encoded RNAs that may be expressed. This is critical to understanding the biology of an organism and for accurate transcriptomic and epitranscriptomic-based analyses. Annotating transcriptomes remains a complex task, particularly in small gene-dense organisms such as viruses, which maximize their coding capacity through overlapping RNAs. To resolve this, we have developed a new software package, nanopore guided annotation of transcriptome architectures (NAGATA), which utilizes nanopore direct RNA sequencing (DRS) data sets to rapidly produce high-resolution transcriptome annotations for diverse viruses and other organisms.
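
The precision and recall mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how matches are scored, so everything here is an assumption: the transcript tuple layout, the function names `matches` and `precision_recall`, the toy coordinates, and the rule that a predicted transcript matches a reference transcript when strand and intron chain are identical and both ends fall within a tolerance of 10 nt. None of this is taken from NAGATA itself.

```python
from typing import List, Tuple

# Hypothetical transcript model: (start, end, strand, introns),
# where introns is a tuple of (intron_start, intron_end) pairs.
Intron = Tuple[int, int]
Transcript = Tuple[int, int, str, Tuple[Intron, ...]]


def matches(pred: Transcript, ref: Transcript, tol: int = 10) -> bool:
    """Assumed rule: same strand, same intron chain, both ends within tol nt."""
    return (
        pred[2] == ref[2]
        and pred[3] == ref[3]
        and abs(pred[0] - ref[0]) <= tol
        and abs(pred[1] - ref[1]) <= tol
    )


def precision_recall(
    predicted: List[Transcript], reference: List[Transcript], tol: int = 10
) -> Tuple[float, float]:
    # Precision: fraction of predicted transcripts that match some reference.
    hit_pred = sum(any(matches(p, r, tol) for r in reference) for p in predicted)
    # Recall: fraction of reference transcripts recovered by some prediction.
    hit_ref = sum(any(matches(p, r, tol) for p in predicted) for r in reference)
    precision = hit_pred / len(predicted) if predicted else 0.0
    recall = hit_ref / len(reference) if reference else 0.0
    return precision, recall


# Toy data (invented coordinates, for illustration only).
reference = [
    (100, 900, "+", ()),                # unspliced transcript
    (100, 900, "+", ((300, 500),)),     # spliced isoform sharing both ends
    (1200, 2000, "-", ()),              # transcript on the opposite strand
]
predicted = [
    (103, 898, "+", ()),                # ends within tolerance: a match
    (100, 900, "+", ((300, 520),)),     # different splice site: no match
]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case one of two predictions is correct and one of three reference transcripts is recovered. The two overlapping "+" strand isoforms show why gene-dense genomes are difficult: transcripts that share both ends can only be told apart by their intron chains.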
