Classification of virulence factors based on dual-channel neural networks with pre-trained language models

Abstract

Virulence factors (VFs) are molecules that enable pathogens to cause infection and disease in a host. They allow pathogens to evade the host's immune defenses and facilitate the progression of infection through various mechanisms. With the increasing prevalence of antibiotic-resistant strains and the emergence of new and re-emerging infectious agents, the classification of VFs has become increasingly important. This study presents PLM-GNN, a dual-channel model designed for the classification of VFs, focusing on the seven most numerous types. It integrates two channels: a structure channel, which employs a geometric graph neural network to capture the three-dimensional structural features of VFs, and a sequence channel, which combines a pre-trained language model with Convolutional Neural Network (CNN) and Transformer architectures to extract local and global features, respectively, from VF sequences. On the independent test set, the method achieved an accuracy of 86.47%, an F1 score of 86.20%, and an Area Under the Receiver Operating Characteristic Curve (AUC) of 97.20%, supporting its effectiveness. In conclusion, PLM-GNN can classify the seven major VF types with high accuracy, offering a novel approach for studying their functions.
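
The dual-channel idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two channels are combined, so fusion by concatenation followed by a single linear layer is an assumption here, and the embedding sizes, random weights, and the names `fuse_and_classify`, `structure_emb`, and `sequence_emb` are hypothetical placeholders standing in for the outputs of the graph neural network and the language-model-based sequence encoder.

```python
import math
import random

NUM_CLASSES = 7  # the abstract targets the seven most numerous VF types


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(vec, weights, bias):
    """Apply one dense layer: one output score per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def fuse_and_classify(structure_emb, sequence_emb, weights, bias):
    """Concatenate both channel embeddings and score the classes.

    Assumption: concatenation is used as the fusion step; the source
    does not specify the actual fusion mechanism.
    """
    fused = structure_emb + sequence_emb  # list concatenation
    return softmax(linear(fused, weights, bias))


# Hypothetical placeholder embeddings and untrained random weights.
rng = random.Random(0)
structure_emb = [rng.uniform(-1, 1) for _ in range(4)]  # stand-in for GNN output
sequence_emb = [rng.uniform(-1, 1) for _ in range(6)]   # stand-in for sequence-channel output
weights = [[rng.uniform(-0.5, 0.5) for _ in range(10)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

probs = fuse_and_classify(structure_emb, sequence_emb, weights, bias)
predicted_class = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

In a trained model the weights would be learned and the embeddings computed from real protein structures and sequences; the sketch only shows how two separately derived feature vectors can feed one seven-way classifier.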
