Abstract
INTRODUCTION: Antibiotic resistance is emerging as a critical global public health threat. Accurate prediction of bacterial antibiotic resistance genes (ARGs) and phenotypes is essential for understanding resistance mechanisms and guiding clinical antibiotic use. Although high-throughput DNA sequencing provides a foundation for identification, current methods lack precision and often require manual intervention.

METHODS: We developed a deep learning model for ARG prediction that integrates representations of bacterial protein sequences from two protein language models, ProtBert-BFD and ESM-1b. The model further employs data augmentation techniques and Long Short-Term Memory (LSTM) networks to enhance feature extraction and classification performance.

RESULTS: The proposed model outperformed existing methods, achieving higher accuracy, precision, recall, and F1-score. It significantly reduced both false negative and false positive predictions in identifying ARGs, providing a robust computational tool for reliable gene-level resistance detection. The model was also successfully applied to predict bacterial resistance phenotypes, demonstrating its potential for clinical use.

DISCUSSION: This study presents an accurate and automated approach for predicting antibiotic resistance genes and phenotypes, reducing the need for manual verification. The model can support clinical decision-making and guide antibiotic use, thereby addressing an urgent need in the fight against antimicrobial resistance.
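
The four metrics named in the results are tied directly to the false positive and false negative counts. As a reminder of how they relate, the following is a minimal sketch using the standard binary-classification definitions. The function name `binary_metrics`, the `BinaryMetrics` container, and the example counts are hypothetical and chosen only for illustration; they are not taken from the study's code or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives (ARG vs. non-ARG, for instance).
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    # Precision falls as false positives rise; recall falls as false negatives rise.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Illustrative counts only; these are NOT results from the study.
example = binary_metrics(tp=90, fp=10, fn=5, tn=95)
print(example)
```

Because precision depends on false positives and recall on false negatives, a model that reduces both error types improves both metrics, and therefore F1, at the same time.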