Abstract
INTRODUCTION: Sexually transmitted infections (STIs) remain a global health challenge, with 374 million new cases annually. Among men, STIs cause serious complications, increase the risk of HIV infection, and contribute to stigma and economic burden, especially in low- and middle-income countries (LMICs). Traditional models often miss complex interactions among risk factors, whereas machine learning and deep learning offer stronger predictive tools. This study applied deep learning and traditional machine learning algorithms to predict self-reported STIs and to identify their predictors among men in 54 LMICs, in order to support targeted interventions that align with global STI prevention and control strategies. METHODS: This study used data from Demographic and Health Surveys conducted in 54 LMICs between 2010 and 2024. A weighted sample of 390,608 sexually active men aged 15 to 64 years was included. The data were cleaned and weighted using STATA version 17 and Python version 3.9. Descriptive statistics were computed in STATA version 17, and Python 3.9 was used for deep learning and traditional machine learning prediction of self-reported STIs. CatBoost, Logistic Regression, Random Forest, LightGBM, AdaBoost, XGBoost, multilayer perceptron (MLP), and TabNet were employed to identify the most influential predictors of self-reported STIs among men. Accuracy and area under the curve (AUC) were used to evaluate the performance of the predictive models. RESULTS: The meta-analysis showed a pooled prevalence of self-reported STIs among men of 4.44% (95% CI: 3.46, 5.43), with substantial variation across nations and regions. The highest prevalence was observed in Liberia (18.66%), Lesotho (15.91%), and Uganda (13.32%), while the lowest was observed in Jordan (0.05%), Armenia (0.38%), and Kyrgyzstan (0.50%).
Moreover, the prevalence of self-reported STIs was highest in Sub-Saharan Africa (5.50%), followed by Asia (3.99%) and Latin America and the Caribbean (1.03%), and lowest in the Middle East and North Africa (0.13%). Prevalence also varied by country income group among LMICs: it was highest in low-income countries (5.53%), intermediate in lower middle-income countries (5.19%), and lowest in upper middle-income countries (1.15%). CatBoost achieved an accuracy of 80.10%, an AUC of 88.80%, a precision of 76.00%, and a recall of 89.00%. SHAP analysis of the CatBoost model identified country income status, region, age, age at first sex, religion, education, HIV knowledge, wealth index, marital status, and place of residence as the top-ranked predictors of self-reported STIs among men in LMICs. CONCLUSION: Self-reported STIs among men in LMICs remain a significant public health concern, with wide national and regional variation. Prevalence was highest in Liberia, Lesotho, and Uganda and lowest in Jordan, Armenia, and Kyrgyzstan. By region, Sub-Saharan Africa had the highest prevalence of self-reported STIs, followed by Asia and Latin America and the Caribbean, with the lowest prevalence in the Middle East and North Africa. Prevalence also differed clearly by country income status: it was highest in low-income countries, intermediate in lower middle-income countries, and lowest in upper middle-income countries. CatBoost was the top-performing machine learning algorithm for predicting self-reported STIs. Country income status, region, age, age at first sex, religion, education, HIV knowledge, wealth index, marital status, and place of residence were identified as the strongest predictors of self-reported STIs.
Findings underscore the potential of machine learning, particularly CatBoost, to capture complex determinants of self-reported STIs and to inform targeted STI prevention and control strategies among men living in LMICs.
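To make the reported evaluation metrics concrete (accuracy, AUC, precision, and recall), the following is a minimal illustrative sketch in Python and not the study's own code. The labels and predicted scores are invented toy values, and the helper names (`confusion_counts`, `roc_auc`, `evaluate`) and the 0.5 decision threshold are assumptions made purely for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and AUC from predicted scores."""
    # The threshold is an illustrative assumption, not a value from the study.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "auc": roc_auc(y_true, scores),
    }


# Toy data (hypothetical): 1 = reported an STI, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
metrics = evaluate(y_true, scores)
print(metrics)
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can show a higher AUC than accuracy, as reported for CatBoost here.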