Abstract
BACKGROUND: Tuberculous pericardial effusion (TPE) is difficult to diagnose because its clinical presentation is nonspecific and resembles that of other types of pericardial effusion. Data on the use of artificial intelligence to predict TPE remain limited. This study aimed to evaluate the diagnostic performance of various machine learning algorithms (MLAs) in identifying TPE among patients with pericardial effusion. MATERIALS AND METHODS: A retrospective study was conducted at Cho Ray Hospital in Vietnam from 2010 to 2020. Eight MLAs were evaluated for their diagnostic accuracy: logistic regression, K-nearest neighbor, support vector machine, random forest, Lagrangian support vector machine, random tree (RT), chi-square automatic interaction detection, and C5.0. The performance metrics included sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, and accuracy. RESULTS: Of the 248 patients with pericardial effusion, 52 were confirmed to have tuberculosis. Predictive factors for TPE included male sex, a lower body mass index, and fever at admission. The RT model demonstrated the highest accuracy (94%) and area under the curve (AUC) (0.971). Pericardial fluid adenosine deaminase was identified as the most significant feature for TPE diagnosis; at an optimal threshold of 27.8 U/L, it yielded a sensitivity of 80.8% and a specificity of 84.2%. CONCLUSION: Machine learning algorithms, particularly the random tree model, demonstrate promising potential for improving TPE diagnosis through noninvasive data analysis. However, successful implementation requires external validation and careful consideration of local healthcare capabilities.
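
The seven performance metrics listed in the methods all derive from a single 2x2 confusion matrix. The following is a minimal Python sketch of those standard definitions, not the study's own analysis code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the example counts are illustrative only: the abstract does not report a confusion matrix, and these counts were chosen because they would reproduce the reported adenosine deaminase sensitivity and specificity if the threshold had been applied to all 248 patients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 diagnostic table (hypothetical helper type)."""
    tp: int  # disease present, test positive
    fp: int  # disease absent, test positive
    tn: int  # disease absent, test negative
    fn: int  # disease present, test negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard diagnostic accuracy metrics from confusion-matrix counts.

    Assumes every denominator is non-zero; in particular, the positive
    likelihood ratio is undefined when specificity equals 1.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
    }


# Illustrative counts only (an assumption, not reported data):
# 52 TPE cases and 196 non-TPE cases, split so that sensitivity and
# specificity round to the 80.8% and 84.2% given in the abstract.
example = ConfusionCounts(tp=42, fp=31, tn=165, fn=10)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the predictive values depend on disease prevalence in the cohort while the likelihood ratios do not, reporting both families of metrics gives a fuller picture than either one alone.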