Abstract
BACKGROUND: Over the last decade, the use of machine learning (ML) techniques for modeling and solving problems has increased significantly, including in kidney transplantation. Numerous studies have used ML to predict outcomes such as delayed graft function (DGF).
AIM: To compare various ML models with logistic regression (LR) in predicting DGF, focusing on donor characteristics.
METHODS: We analyzed 523 deceased donor kidney transplants performed between 2010 and 2020 across three transplant centers. The dataset included 14 donor, 3 transplant, and 64 recipient features. Four problem types were defined based on variable combinations: donor-only, donor + transplant, donor + recipient, and donor + transplant + recipient. The dataset comprised 43.5% DGF-positive and 56.5% DGF-negative patients and was split into 80% for training and 20% for validation/testing. Six ML models (support vector machine, decision tree, random forest (RF), gradient boosting (GB), extreme gradient boosting (XGB), and multilayer perceptron) were compared with LR. Hyperparameters were optimized using random search and 10-fold cross-validation. Accuracy was the primary performance metric.
RESULTS: The best-performing ML model for each problem type achieved accuracies of 70% (RF), 70% (RF), 58% (RF), and 61% (XGB) for donor-only, donor + transplant, donor + recipient, and donor + transplant + recipient, respectively. LR achieved accuracies of 57%, 66%, 52%, and 66% for the same problem types; however, these models generally showed low sensitivity and high specificity. Across most models, significant predictors included donor creatinine, age, and mean blood pressure; cold ischemia time (a transplant variable); and recipient smoking status.
CONCLUSION: While most ML models outperformed LR, the differences were not substantial. This may be attributed to the small dataset size, which likely contributed to the overall poor performance.
We recommend using these complex models with high-quality datasets that include a sufficient number of variables and observations to fully leverage their potential. The key question for future research is determining the dataset size required for ML to become the primary analytic tool for predicting kidney transplant outcomes.
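To illustrate why accuracy alone can mask low sensitivity paired with high specificity, the following minimal Python sketch computes all three metrics from a confusion matrix. The class name `ConfusionMatrix` and the counts are hypothetical and chosen only for illustration: they roughly mimic a 20% hold-out of 523 transplants with 43.5% DGF-positive cases, and they are not results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with DGF-positive as the positive class."""
    tp: int  # DGF-positive patients predicted positive
    fn: int  # DGF-positive patients predicted negative
    tn: int  # DGF-negative patients predicted negative
    fp: int  # DGF-negative patients predicted positive

    @property
    def accuracy(self) -> float:
        # Share of all patients classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def sensitivity(self) -> float:
        # Share of DGF-positive patients that the model detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of DGF-negative patients that the model correctly rules out.
        return self.tn / (self.tn + self.fp)

    @property
    def balanced_accuracy(self) -> float:
        # Mean of sensitivity and specificity; less affected by class imbalance.
        return (self.sensitivity + self.specificity) / 2


# Hypothetical counts (NOT study data): 105 test patients, 46 DGF-positive.
cm = ConfusionMatrix(tp=20, fn=26, tn=50, fp=9)

print(f"accuracy:          {cm.accuracy:.3f}")
print(f"sensitivity:       {cm.sensitivity:.3f}")
print(f"specificity:       {cm.specificity:.3f}")
print(f"balanced accuracy: {cm.balanced_accuracy:.3f}")
```

In this made-up example, about two thirds of patients are classified correctly even though fewer than half of the DGF-positive cases are detected. For reference, with the class balance reported above, a classifier that always predicts DGF-negative would reach about 56.5% accuracy, so reported accuracies are best read against that baseline.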