A Unified Framework for Statistical Inference and Power Analysis of Single and Comparative Fβ Scores

Abstract

Machine learning and artificial intelligence are increasingly applied to medical diagnostics and clinical decision-making. To evaluate model performance, the F1 score and its generalized form, the Fβ score, are widely used as they balance precision and sensitivity. However, rigorous statistical inference and power analysis for the F1 and Fβ scores remain limited. In this study, we propose psF1, a unified and comprehensive framework for interval estimation, hypothesis testing, and power and sample size calculation for both single and comparative F1 and Fβ scores. psF1 leverages exact probability distributions as well as approximations for large sample sizes to provide valid statistical inference and power analyses. Extensive simulations demonstrate the accuracy and robustness of psF1 across a range of sensitivity, precision, and sample size scenarios. We further showcase its practical utility through real-world biomedical classification tasks. This framework enables principled evaluation and comparison of classifiers using F1 and Fβ scores with reliable uncertainty quantification and informed sample size planning. psF1 is freely available at http://github.com/cyhsuTN/psF1.
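For readers unfamiliar with the metric, the following is a minimal sketch of the standard Fβ point estimate computed from confusion-matrix counts, Fβ = (1 + β²)·TP / ((1 + β²)·TP + β²·FN + FP). It is not part of psF1 and does not reproduce its interval estimation or power calculations; the function name `f_beta` and the example counts are illustrative assumptions.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Point estimate of the F-beta score from confusion-matrix counts.

    Hypothetical helper for illustration only; not the psF1 API.
    beta > 1 weights sensitivity (recall) more heavily, beta < 1 weights precision.
    """
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    if denom == 0:
        # No positives were observed or predicted, so the score is undefined.
        raise ValueError("F-beta is undefined when tp, fp, and fn are all zero")
    return (1 + b2) * tp / denom


if __name__ == "__main__":
    # Made-up counts: 6 true positives, 2 false positives, 4 false negatives.
    print(round(f_beta(6, 2, 4), 4))            # F1
    print(round(f_beta(6, 2, 4, beta=2.0), 4))  # F2
```

With β = 1 this reduces to the F1 score, the harmonic mean of precision and sensitivity. A point estimate like this carries no uncertainty information on its own, which is the gap the framework described above addresses.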
