TISCalling: leveraging machine learning to identify translational initiation sites in plants and viruses

Abstract

Recognizing translational initiation sites (TISs) offers complementary insights for identifying genes that encode novel proteins or small peptides. Conventional computational methods primarily identify Ribo-seq-supported TISs and lack the capacity for systematic, global identification of TISs, especially non-AUG sites in plants. Additionally, these methods are often unsuitable for evaluating the importance of mRNA sequence features in TIS determination. In this study, we present TISCalling, a robust framework that combines machine learning (ML) models and statistical analysis to identify and rank novel TISs across eukaryotes. TISCalling generalizes and ranks important features common to multiple plant and mammalian species while identifying kingdom-specific features such as mRNA secondary structures and "G"-nucleotide contents. Furthermore, TISCalling achieved high predictive power for identifying novel viral TISs. Importantly, TISCalling provides prediction scores for putative TISs along plant transcripts, enabling prioritization of sites of interest for further validation. We offer TISCalling as a command-line package [ https://github.com/yenmr/TISCalling ] capable of generating prediction models and identifying key sequence features. Additionally, we provide web tools [ https://predict.southerngenomics.org/TISCalling/ ] for visualizing pre-computed potential TISs, making the framework accessible to users without programming experience. The TISCalling framework offers a sequence-aware and interpretable approach for decoding genome sequences and exploring functional proteins in plants and viruses.
