Abstract
BACKGROUND: Survival analysis is widely used to predict time-to-event outcomes, with the Cox regression model being a standard approach. However, machine learning methods such as Random Survival Forests (RSF) can capture complex, non-linear relationships that traditional models may miss.

OBJECTIVE: This study compared the predictive performance of RSF and Cox regression in modelling tuberculosis (TB) mortality.

METHODS: We conducted a retrospective study of TB patients treated at the East London Central Clinic in South Africa. Patient data included demographic, clinical, and treatment-related variables. Model performance was evaluated using five metrics (C-index, Brier Score, Integrated Brier Score, Integrated Absolute Error, and Integrated Squared Error) along with time-dependent receiver operating characteristic (ROC) curves. Variable importance was assessed to identify key predictors.

RESULTS: The RSF model consistently outperformed the Cox model across all evaluation metrics. RSF achieved a higher integrated AUC (0.815 vs. 0.652) and lower prediction error (IBS = 0.235 vs. 0.261). Important predictors of mortality included age, sex, weight, and disease class, with RSF capturing their time-dependent effects more accurately. The cumulative case/dynamic control ROC curve showed the strongest predictive accuracy at 120 days (AUC = 0.856).

CONCLUSION: RSF demonstrated superior predictive accuracy compared with Cox regression in modelling TB mortality. Its ability to account for non-linear and time-dependent effects makes it a potentially useful tool for improving risk prediction and guiding patient management in TB care.

CLINICAL TRIAL: Not applicable.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12879-026-13109-9.
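ILLUSTRATIVE SKETCH: The C-index listed among the evaluation metrics can be illustrated with a minimal pure-Python version of Harrell's concordance index for right-censored data. This is a sketch under stated assumptions, not the study's implementation: the function name `harrell_c_index`, the simplified handling of tied times (skipped), and the toy follow-up times, event indicators, and risk scores are all hypothetical and are not taken from the study data.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       : observed follow-up times (e.g. days)
    events      : 1 if the event (death) was observed, 0 if censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped in this simplified sketch
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # subject who died earlier had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, for illustration only)
toy_times = [30, 60, 90, 120, 150]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.3, 0.7, 0.2]

c_index = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"C-index: {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk. Comparing two models on this metric amounts to calling the function once per model with each model's risk scores on the same patients.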