PhageCGRNet: Integrating Chaos Game Representation of Genomes with Convolutional Neural Network for accurate phage host classification prediction



Abstract

Phages (bacteriophages) play a critical role in microbial communities, and accurately predicting their hosts is essential for understanding the dynamics of these viruses and their impact on bacterial populations. In phage host classification, feature extraction is a critical step that directly affects prediction accuracy, and k-mers are the most commonly used features. Although many k-mer-based methods have been proposed, they typically use only k-mer frequency information. When k-mer frequencies are identical, however, frequency information alone becomes less useful. To address this limitation, we propose a novel method called PhageCGRNet, which uses both the frequency information and the positional information of k-mers. Each genome sequence is represented as a three-dimensional matrix containing k-mer frequency features and positional features, and a Convolutional Neural Network then predicts the host category. Specifically, the k-mer frequency information extracted directly from the sequences is combined with the k-mer positional information obtained using the Chaos Game Representation method to construct the feature matrix, which serves as the input to the Convolutional Neural Network. We conducted experiments on two benchmark datasets and compared PhageCGRNet with existing advanced methods for phage host classification. The experimental results demonstrate that PhageCGRNet achieves higher accuracy than other state-of-the-art methods at both the species and genus taxonomic levels on these two datasets.
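To make the feature construction more concrete, the following is a minimal sketch of how a Chaos Game Representation (CGR) grid could carry both k-mer frequency and positional channels. The abstract does not specify the exact layout of the three-dimensional matrix, so everything here is an assumption for illustration: the function name `cgr_feature_tensor`, the corner assignment for A/C/G/T, the choice of k, and the use of the mean CGR coordinates per cell as the "positional" channels are hypothetical and may differ from the authors' design.

```python
import numpy as np

# Assumed corner convention for the CGR unit square (one common choice;
# the paper's convention is not given in the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_feature_tensor(seq, k=3):
    """Return a (2^k, 2^k, 3) array: k-mer frequency, mean CGR x, mean CGR y.

    Hypothetical layout: channel 0 holds relative k-mer frequencies,
    channels 1 and 2 hold the average CGR coordinates of the points
    that landed in each cell.
    """
    n = 2 ** k
    counts = np.zeros((n, n))
    sum_x = np.zeros((n, n))
    sum_y = np.zeros((n, n))

    x, y = 0.5, 0.5  # CGR walk starts at the centre of the square
    run = 0          # consecutive valid nucleotides seen so far

    for base in seq.upper():
        if base not in CORNERS:
            # Ambiguous symbol (e.g. N): restart the walk
            x, y, run = 0.5, 0.5, 0
            continue
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current nucleotide
        x, y = (x + cx) / 2, (y + cy) / 2
        run += 1
        if run >= k:
            # After k valid steps, the cell is fixed by the last k bases
            col = min(int(x * n), n - 1)
            row = min(int(y * n), n - 1)
            counts[row, col] += 1
            sum_x[row, col] += x
            sum_y[row, col] += y

    total = counts.sum()
    freq = counts / total if total > 0 else counts
    # Mean coordinates per cell; cells with no k-mer stay at zero
    mean_x = np.divide(sum_x, counts, out=np.zeros_like(sum_x), where=counts > 0)
    mean_y = np.divide(sum_y, counts, out=np.zeros_like(sum_y), where=counts > 0)
    return np.stack([freq, mean_x, mean_y], axis=-1)


if __name__ == "__main__":
    features = cgr_feature_tensor("ACGTTGCAACGT", k=3)
    print(features.shape)
```

Because the exact point inside a cell depends on the bases that precede the k-mer, the coordinate channels in this sketch can differ between two sequences whose k-mer counts are identical, which is the kind of extra signal the abstract attributes to positional information. A tensor of this shape can be passed to a 2D convolutional network as a multi-channel image.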
