Essential gene prediction in Drosophila melanogaster using machine learning approaches based on sequence and functional features.

Authors: Aromolaran Olufemi, Beder Thomas, Oswald Marcus, Oyelade Jelili, Adebiyi Ezekiel, Koenig Rainer
Genes are termed essential if their loss of function compromises viability or results in a profound loss of fitness. At genome scale, such genes can be identified experimentally with RNAi or knockout screens, but these screens are very resource-intensive. Computational methods for essential gene prediction can overcome this drawback, particularly when intrinsic features (e.g. from the protein sequence) as well as extrinsic features (e.g. from transcription profiles) are considered. In this work, we employed machine learning to predict essential genes in Drosophila melanogaster. A total of 27,340 features were generated, covering nucleotide and protein sequences, gene networks, protein-protein interactions, evolutionary conservation and functional annotations. Employing cross-validation, we obtained excellent prediction performance. In D. melanogaster, the best model achieved a ROC-AUC of 0.90, a PR-AUC of 0.30 and an F1 score of 0.34. Our approach considerably outperformed a benchmark method that used only features derived from the protein sequences (P < 0.001). Investigating which features contributed to this success, we found that all feature categories contributed, most prominently network-topological, functional and sequence-based features. To evaluate our approach, we applied the same workflow to essential gene prediction in human and achieved a ROC-AUC of 0.97, a PR-AUC of 0.73 and an F1 score of 0.64. In summary, this study shows that our carefully assembled feature set, covering a broad range of intrinsic and extrinsic gene and protein features, enables intelligent systems to predict the essentiality of genes in an organism well.
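
The three metrics reported above (ROC-AUC, PR-AUC and F1) can differ widely on the same predictions when the positive class is rare. A random classifier scores about 0.5 in ROC-AUC, whereas its PR-AUC is roughly the fraction of positives in the data. The following is a minimal sketch in plain Python of how such metrics can be computed. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and PR-AUC is approximated here by average precision.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """PR-AUC approximated as the mean precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


def f1_at_threshold(labels, scores, threshold=0.5):
    """F1 score after calling every gene with score >= threshold essential."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical toy data: 10 genes, 2 of them essential (label 1).
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1, 0.05]

print(f"ROC-AUC: {roc_auc(labels, scores):.3f}")
print(f"PR-AUC (average precision): {average_precision(labels, scores):.3f}")
print(f"F1 at threshold 0.5: {f1_at_threshold(labels, scores):.3f}")
```

In this toy case, two non-essential genes ranked above the second essential gene cost little in ROC-AUC but noticeably lower the average precision and F1, which is one reason to report all three metrics for imbalanced problems such as essential gene prediction.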
