SnoBIRD: a tool to identify C/D box snoRNAs and refine their annotation across all eukaryotes.

Authors: Fafard-Couture Étienne, Boulanger Cédric, Faucher-Giguère Laurence, Sinagoga Vanessa, Berthoumieux Mélodie, Hedjam Jordan, Marcel Virginie, Durand Sébastien, Bayfield Mark A, Bachand François, Abou Elela Sherif, Jacques Pierre-Étienne, Scott Michelle S
Small nucleolar RNAs (snoRNAs), a group of noncoding RNAs present in all eukaryotes, are most extensively characterized for their regulation of ribosome biogenesis and splicing. Despite their central roles, current snoRNA annotations remain incomplete. Several eukaryote genome annotations contain few or no snoRNAs, and none distinguish expressed snoRNAs from their pseudogenes, a recently characterized snoRNA subclass with distinct features and expression levels. To address this, we developed SnoBIRD, a BERT-based C/D box snoRNA predictor trained on snoRNAs spanning all eukaryote kingdoms. We show that SnoBIRD outperforms existing tools and is the only predictor capable of identifying snoRNA pseudogenes using biologically relevant signal. Applying SnoBIRD to the fission yeast and human genomes, we demonstrate that it is the only tool whose runtime scales well with genome size, and we identify and experimentally validate several new SnoBIRD-predicted C/D box snoRNAs. By running SnoBIRD on multiple eukaryote genomes, we identify hundreds of novel snoRNA candidates and highlight SnoBIRD's usefulness in determining the evolutionary paths of snoRNAs distributed across different species. Overall, SnoBIRD represents a user-friendly and efficient tool for reliably predicting C/D box snoRNAs and their pseudogenes in any eukaryote genome.
