There is a growing need to understand how important or relevant the available descriptive features are for predicting target attributes, that is, to solve the feature ranking problem. The vast majority of published work addresses feature selection and extraction rather than feature ranking. In this paper, we propose a novel method based on the Bayesian approach that not only provides a methodically justified way of ranking features on small datasets, but also offers a principled way to benchmark the results obtained by different ranking algorithms. The proposed method is model-free, since it imposes no restrictions on the model. We experimentally compare the proposed method with the classical frequency method on two synthetic datasets and two public medical datasets. The proposed ranking method achieves a high level of self-consistency (stability) with as few as 50 samples, a substantial improvement over classical logistic regression and SHAP ranking. All experiments confirm our theoretical conclusions: as the sample size grows, mutual consistency tends to increase, and our method yields results that are at least comparable, and often superior, to those of other methods in terms of self-consistency and monotonicity. The proposed method can be applied to a wide class of problems that involve ranking influence factors on small samples, including industrial tasks, forensics, and psychology.
Feature Ranking on Small Samples: A Bayes-Based Approach.
Authors: Vatian Aleksandra, Gusarova Natalia, Tomilov Ivan
| Journal: | Entropy | Impact factor: | 2.000 |
| Year: | 2025 | Citation: | 2025 Jul 22; 27(8):773 |
| doi: | 10.3390/e27080773 | ||
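
The abstract evaluates rankings by their self-consistency (stability) on samples of about 50 observations. The sketch below illustrates one common way such a quantity can be measured; it is not the Bayesian method proposed in the paper. It assumes a simple stand-in ranker (absolute Pearson correlation with the target), a hypothetical synthetic dataset with made-up weights, and mean pairwise Kendall tau across random subsamples as the stability score. All function names are illustrative.

```python
import random
from itertools import combinations


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def rank_features(rows, targets):
    """Stand-in ranker (an assumption): order feature indices by |correlation|."""
    n_features = len(rows[0])
    scores = [pearson_abs([r[j] for r in rows], targets) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])


def kendall_tau(order_a, order_b):
    """Kendall tau between two rankings of the same items (no ties)."""
    pos_a = {f: i for i, f in enumerate(order_a)}
    pos_b = {f: i for i, f in enumerate(order_b)}
    concordant = discordant = 0
    for f, g in combinations(order_a, 2):
        if (pos_a[f] - pos_a[g]) * (pos_b[f] - pos_b[g]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def self_consistency(rows, targets, sample_size=50, n_repeats=20, seed=0):
    """Mean pairwise Kendall tau of rankings built on random subsamples."""
    rng = random.Random(seed)
    rankings = []
    for _ in range(n_repeats):
        idx = rng.sample(range(len(rows)), sample_size)
        rankings.append(
            rank_features([rows[i] for i in idx], [targets[i] for i in idx])
        )
    taus = [kendall_tau(a, b) for a, b in combinations(rankings, 2)]
    return sum(taus) / len(taus)


def make_synthetic(n=1000, seed=1):
    """Hypothetical data: four Gaussian features with decreasing true weights."""
    rng = random.Random(seed)
    weights = [3.0, 1.5, 0.5, 0.0]
    rows, targets = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in weights]
        score = sum(w * v for w, v in zip(weights, x)) + rng.gauss(0, 1)
        rows.append(x)
        targets.append(1 if score > 0 else 0)
    return rows, targets


if __name__ == "__main__":
    rows, targets = make_synthetic()
    for size in (20, 50, 200):
        print(size, round(self_consistency(rows, targets, sample_size=size), 3))
```

A score near 1 means that rankings built on different subsamples of the same size largely agree, while a score near 0 means they are close to unrelated. Swapping `rank_features` for another ranker allows a like-for-like stability comparison under the same subsampling scheme.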
