A hybrid machine learning model combining association rule mining and classification algorithms to predict differentiated thyroid cancer recurrence

Abstract

BACKGROUND: Differentiated thyroid cancer (DTC) is the most prevalent endocrine malignancy with a recurrence rate of about 20%, necessitating better predictive methods for patient management. This study aims to create a relational classification model to predict DTC recurrence by integrating clinical, pathological, and follow-up data.

METHODS: The balanced dataset comprises 550 DTC samples collected over 15 years, featuring 13 clinicopathological variables. To address the class imbalance in recurrence status, the Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTE-NC) was utilized. A hybrid model combining classification algorithms with association rule mining was developed. Two relational classification approaches, regularized class association rules (RCAR) and classification based on association rules (CBAR), were implemented. Binomial logistic regression analyzed independent predictors of recurrence. Model performance was assessed through accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score.

RESULTS: The RCAR model demonstrated superior performance over the CBAR model, achieving accuracy, sensitivity, and F1 score of 96.7%, 93.1%, and 96.7%, respectively. Association rules highlighted that papillary pathology with an incomplete response strongly predicted recurrence. The combination of incomplete response and lymphadenopathy was also a significant predictor. Conversely, the absence of adenopathy and complete response to treatment were linked to freedom from recurrence. Incomplete structural response was identified as a critical predictor of recurrence risk, even with other low-recurrence conditions.

CONCLUSION: This study introduces a robust and interpretable predictive model that enhances personalized medicine in thyroid cancer care. The model effectively identifies high-risk individuals, allowing for tailored follow-up strategies that could improve patient outcomes and optimize resource allocation in DTC management.
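The six performance measures named in the methods (accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function names, the `ConfusionCounts` class, and the coding of recurrence as `1` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # recurrence predicted, recurrence observed
    fp: int  # recurrence predicted, no recurrence observed
    tn: int  # no recurrence predicted, no recurrence observed
    fn: int  # no recurrence predicted, recurrence observed


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionCounts:
    """Tally the four cells, treating `positive` as the recurrence class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: float, denominator: float) -> float:
    """Return NaN instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the six metrics from the confusion-matrix cells."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall for the recurrence class
    specificity = _ratio(c.tn, c.tn + c.fp)
    ppv = _ratio(c.tp, c.tp + c.fp)           # precision
    npv = _ratio(c.tn, c.tn + c.fn)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn)
    f1 = _ratio(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }
```

Calling `binary_metrics(confusion_counts(y_true, y_pred))` on held-out labels and predictions returns all six values in one dictionary. Sensitivity and negative predictive value are the measures most relevant to the clinical use described here, because a missed recurrence (a false negative) is the costly error when deciding how closely to follow a patient.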
