Abstract
Background: Antimicrobial resistance (AMR) in Neisseria gonorrhoeae is an escalating global health challenge, with gonorrhoea affecting over 82 million individuals each year. Increasing resistance to commonly used antibiotics such as azithromycin, ciprofloxacin, and cefixime hinders timely and effective treatment, primarily because resistant strains are detected late. Methods: To address this delay, a hybrid machine learning (ML) and deep learning (DL) framework was developed using a dataset of 3786 N. gonorrhoeae isolates. The dataset included clinical metadata and phenotypic resistance profiles. Preprocessing involved handling 23% data sparsity, imputing 31 skewed columns, and applying skew-aware resampling and harmonisation techniques. A predictive pipeline was constructed using both clinical variables and genomic unitigs, and a suite of 33 classifiers was evaluated. Results: The CatBoost model emerged as the top-performing ML algorithm, largely owing to its handling of categorical data, while a three-layered neural network served as the DL baseline. The ML models outperformed genome-wide association study (GWAS) benchmarks, achieving AUC scores of 0.97 (ciprofloxacin), 0.95 (cefixime), and 0.94 (azithromycin), representing a 4-7% improvement. SHAP analysis identified biologically relevant resistance markers, such as penA mosaic alleles and mtrR promoter mutations, supporting the interpretability of the model. Conclusions: The study highlights the potential of ML-driven approaches to enhance the real-time prediction of antimicrobial resistance in N. gonorrhoeae. These methods can contribute substantially to antibiotic stewardship programs, although further validation is required in low-resource settings to confirm their generalisability and robustness across diverse populations.
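For readers less familiar with the metric, the AUC scores reported above can be read as the probability that a randomly chosen resistant isolate receives a higher predicted score than a randomly chosen susceptible one. The following is a minimal sketch of that rank-based definition, not the study's evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC: fraction of (resistant, susceptible) pairs
    in which the resistant isolate gets the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # resistant isolates
    neg = [s for y, s in zip(labels, scores) if y == 0]  # susceptible isolates
    if not pos or not neg:
        raise ValueError("need at least one resistant and one susceptible isolate")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = resistant, 0 = susceptible (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.92, 0.81, 0.40, 0.35, 0.10, 0.55, 0.20, 0.77]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")  # 15 of 16 pairs ranked correctly -> 0.9375
```

The pairwise loop is quadratic in the number of isolates, which is acceptable for illustration; on a dataset of several thousand isolates a sort-based implementation from an established library would be the more practical choice.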