Abstract
OBJECTIVE: This study used real-world data from patients with primary membranous nephropathy (PMN) to develop a preliminary machine learning model for predicting venous thromboembolism (VTE) risk, with the aim of supporting more rational use of prophylactic anticoagulant therapy in these patients.
METHODS: We collected diagnostic and treatment data for PMN patients hospitalized at Sichuan Provincial People's Hospital from 1 January 2018 to 30 September 2024. The data were divided into training and test sets at an 8:2 ratio and then processed using combinations of three imputation methods, three sampling methods, and three feature selection methods. After preprocessing, fourteen machine learning algorithms were used to develop predictive models for VTE risk in PMN patients. The SHapley Additive exPlanation (SHAP) method was used to interpret the contribution of each feature to the model's predictions. Finally, a VTE risk prediction tool for PMN patients was built using Streamlit.
RESULTS: A total of 643 patients with PMN were included in the study, of whom 93 developed VTE. Among the 504 models constructed, the NGBoost model, which combined K-Nearest Neighbor imputation, Borderline-SMOTE sampling, and Frequency-based Selection of features, was identified as the optimal model, achieving an area under the curve (AUC) of 0.911. The optimal model included ten features: D-dimer (DD), fibrin degradation products (FDP) >5 mg/L, international normalized ratio (INR) of prothrombin, recurrent nephrotic syndrome (RNS), cholinesterase (CHE), urinary microalbumin-to-creatinine ratio (umALB/Ucr), statins, antithrombin III (AT III) activity, albumin, and anti-phospholipase A2 receptor antibody (aPLA2Rab). An online predictive tool based on the optimal model was then developed to provide real-time, individualized VTE risk predictions for PMN patients.
CONCLUSION: This study developed a personalized machine learning model for predicting VTE risk in PMN patients, together with a web-based tool for applying it. The model showed strong predictive performance (AUC 0.911) and may assist clinical decision-making for the prevention and treatment of VTE in PMN patients.
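The workflow described in METHODS (8:2 split, imputation, model fitting, AUC evaluation) can be illustrated with a minimal sketch. It is not the study's code, and several parts are assumptions: the data are synthetic and only mimic the cohort size and approximate event rate; the split is stratified, which the abstract does not state; scikit-learn's `GradientBoostingClassifier` stands in for NGBoost; and the Borderline-SMOTE and feature selection steps are omitted so that the example depends only on scikit-learn and NumPy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import KNNImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the cohort (assumption): 643 rows, 10 features,
# roughly 14% positives to mimic 93 VTE events among 643 patients.
X, y = make_classification(
    n_samples=643, n_features=10, n_informative=5,
    weights=[0.855], random_state=0,
)

# Knock out about 5% of values at random so the imputer has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# 8:2 train/test split; stratifying on the outcome is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# K-Nearest Neighbor imputation followed by a gradient boosting classifier
# (a stand-in for NGBoost, not the model reported in the abstract).
model = make_pipeline(
    KNNImputer(n_neighbors=5),
    GradientBoostingClassifier(random_state=0),
)
model.fit(X_train, y_train)

# Predicted probability of the positive class, scored by AUC on the test set.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"Test AUC on synthetic data: {auc:.3f}")
```

Placing the imputer inside the pipeline means it is fitted on the training set only, so no information from the test set reaches the imputation step. The printed AUC describes the synthetic data and has no bearing on the reported 0.911.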