Abstract
Whole-genome sequencing (WGS) data are an invaluable resource for understanding antimicrobial resistance (AMR) mechanisms. However, WGS data are high-dimensional, and the lack of standardized genomic representations is a key barrier to AMR phenotype prediction. To fully explore these high-resolution data, we propose AMR-GNN, a graph deep learning-based framework that integrates multiple genomic representations with graph neural networks (GNNs) to enable AMR phenotype prediction from genomic sequence data. We test AMR-GNN on Pseudomonas aeruginosa, a clinically relevant Gram-negative bacterial pathogen known for its complex AMR mechanisms. We present AMR-GNN as a proof-of-concept framework designed to address several key challenges in AMR phenotype prediction with data-driven machine learning (ML) approaches, namely enhancing performance by using multiple genomic representations, mitigating the influence of clonal relationships, and identifying informative biomarkers to provide explainability. Follow-up validation on the largest publicly available dataset spanning both Gram-negative and Gram-positive pathogens highlights AMR-GNN's broad applicability in detecting AMR in diverse and clinically relevant pathogen-drug combinations.