A deep learning framework with hybrid stacked sparse autoencoder for type 2 diabetes prediction

Abstract

Sparse numerical datasets are common in fields such as applied mathematics, astronomy, finance, and healthcare, and they pose challenges because of their high dimensionality and sparse distribution. The predominance of zero values complicates feature selection, which makes data analysis harder and can hurt model performance. To address this challenge, this study introduces a deep learning-based algorithm, the Hybrid Stacked Sparse Autoencoder (HSSAE), which integrates L1 and L2 regularization with binary cross-entropy loss to improve feature selection efficiency: L1 regularization penalizes large weights, simplifying data representations, while L2 regularization prevents overfitting by limiting the total weight size. Additionally, dropout improves the algorithm's performance by randomly deactivating neurons during training, which avoids over-reliance on specific features, and batch normalization stabilizes weight distributions, reducing computational complexity and accelerating convergence. HSSAE was evaluated against traditional classifiers (Decision Tree, Random Forest, K-Nearest Neighbors, and Naïve Bayes) and against deep learning-based models (Convolutional Neural Network, Long Short-Term Memory, and Stacked Sparse Autoencoder) in terms of Precision, Recall, Accuracy, F1-score, AUC, and Hamming Loss. Tested on two sparse datasets, HSSAE achieved the highest accuracy among the compared classifiers: 89% on the health indicator dataset and 93% on the EHRs diabetes prediction dataset. HSSAE extracts features effectively and enhances robustness, making it well-suited for sparse data applications, particularly in healthcare, where high prediction accuracy is crucial.
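
The abstract does not state the exact objective, so the following is only a minimal sketch of how a binary cross-entropy loss is commonly combined with L1 and L2 weight penalties (an elastic-net style sum). The function names and the coefficients `lambda_l1` and `lambda_l2` are hypothetical and chosen for illustration; applying both penalties to the weights rather than to the hidden activations, and the additive form itself, are assumptions and may differ from the authors' formulation.

```python
import math


def binary_cross_entropy(targets, outputs, eps=1e-12):
    """Mean binary cross-entropy between targets in [0, 1] and outputs in (0, 1)."""
    total = 0.0
    for t, y in zip(targets, outputs):
        y = min(max(y, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(y) + (1.0 - t) * math.log(1.0 - y))
    return total / len(targets)


def l1_penalty(weights):
    """Sum of absolute weight values (tends to push weights toward zero)."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squared weight values (discourages large weights overall)."""
    return sum(w * w for w in weights)


def hybrid_loss(targets, outputs, weights, lambda_l1=1e-4, lambda_l2=1e-4):
    """Assumed hybrid objective: BCE + lambda_l1 * L1 + lambda_l2 * L2.

    lambda_l1 and lambda_l2 are hypothetical hyperparameters, not values
    reported in the abstract.
    """
    return (
        binary_cross_entropy(targets, outputs)
        + lambda_l1 * l1_penalty(weights)
        + lambda_l2 * l2_penalty(weights)
    )


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from the paper.
    loss = hybrid_loss([1, 0], [0.5, 0.5], [1.0, -2.0, 0.0],
                       lambda_l1=0.1, lambda_l2=0.01)
    print(f"hybrid loss: {loss:.4f}")
```

In a real training loop the weights would be the flattened parameters of the stacked encoder and decoder layers, and the two coefficients would be tuned on validation data alongside the dropout rate.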
