Abstract
This study aimed to comprehensively investigate the epidemiology of nocardiosis worldwide and to develop an interpretable machine learning (ML) model to predict mortality in patients with nocardiosis. The PubMed and Web of Science databases were searched using the keywords "Nocardia" or "nocardiosis" for literature published through 31 August 2024. Nine ML algorithms were employed to predict mortality in patients with nocardiosis. A total of 9,750 reported cases were identified and included. Most cases were from North America and Asia. The mean age of patients was 50.4 ± 19.3 years, with a male predominance (64.7%). The overall all-cause mortality rate was 19.8%, and disseminated infections were associated with a higher mortality rate of 31.7%. Since 2000, the number of reported nocardiosis cases has increased markedly, while the all-cause mortality rate has decreased significantly and stabilized. The distribution of Nocardia species exhibited regional variation. Advanced age, male sex, underlying diseases, disseminated infection, infection type, clinical features, and use of corticosteroids or immunosuppressants were associated with a higher risk of all-cause mortality [odds ratios (ORs) = 1.35-2.63, P < 0.05]. The stochastic gradient boosting (SGBT) model outperformed the eight other ML models, accurately predicting mortality in patients with nocardiosis across both the training and test datasets. This study provides a comprehensive overview of the global epidemiology and species distribution of nocardiosis, highlighting distinct regional patterns. The interpretable ML model developed and validated here can help clinicians identify high-risk patients early and provides a basis for developing personalized treatment plans.