Suboptimal capability of individual machine learning algorithms in modeling small-scale imbalanced clinical data of local hospital

Abstract

In recent years, artificial intelligence (AI) has shown promise across many scientific domains, including biochemical analysis. However, how well AI can model small-scale, imbalanced datasets in these fields remains an open question. This study evaluates eight basic AI algorithms, including ridge regression, logistic regression, and random forest regression, among others, on a small, imbalanced clinical dataset (total n = 387; class 0 = 27, class 1 = 360) of biochemical blood test records from patients with multiple wasp stings (MWS). Under rigorous evaluation with k-fold cross-validation and comprehensive scoring, none of the models could model the data effectively. Even after fine-tuning the hyperparameters of the best-performing models, the results remained below acceptable thresholds. The study highlights the challenges of applying AI to small-scale datasets with imbalanced groups in biochemical or clinical research and emphasizes the need for novel algorithms tailored to small-scale data. The findings also call for further exploration of techniques such as transfer learning and data augmentation, and they underline the importance of determining the minimum dataset size required for effective AI modeling in biochemical contexts.
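
To make the imbalance concrete, the following minimal sketch uses only the class sizes stated in the abstract (27 versus 360) and no actual patient data. It is not the authors' pipeline: the choice of k = 5, the use of stratified folds, and balanced accuracy as the metric are all assumptions made for illustration, since the abstract does not specify them. The sketch shows how few minority-class samples land in each fold and why plain accuracy can look strong for a model that has learned nothing.

```python
import random

# Class sizes reported in the abstract; the labels below are placeholders,
# not real patient records.
N_CLASS0, N_CLASS1 = 27, 360


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(t == cls for t in y_true)
        hit = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        recalls.append(hit / total)
    return sum(recalls) / len(recalls)


labels = [0] * N_CLASS0 + [1] * N_CLASS1

# k = 5 is an assumed value; the abstract only says "k-fold".
folds = stratified_folds(labels, k=5)
minority_per_fold = [sum(labels[i] == 0 for i in fold) for fold in folds]

# Trivial baseline: always predict the majority class (1).
baseline_pred = [1] * len(labels)
baseline_acc = accuracy(labels, baseline_pred)
baseline_bal_acc = balanced_accuracy(labels, baseline_pred)

print("minority samples per fold:", minority_per_fold)
print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"baseline balanced accuracy: {baseline_bal_acc:.3f}")
```

Under these assumptions each held-out fold contains only five or six class 0 samples, so a single misclassification shifts that fold's minority recall by roughly 17 to 20 percentage points. The always-majority baseline reaches about 93% accuracy while its balanced accuracy is 0.5, which is one reason scores on data like this need to be read against a trivial baseline.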
