Explainable AI based cervical cancer prediction using FSAE feature engineering and H2O AutoML



Abstract

Cervical cancer, predominantly caused by Human Papillomavirus (HPV) infection, remains a significant global health burden for women, contributing to elevated morbidity and mortality rates. Early and accurate prediction is critical for improving patient outcomes and optimizing healthcare resource allocation. While machine learning (ML) and deep learning (DL) methods, such as support vector machines, random forests, and convolutional neural networks, have demonstrated promise in disease prediction, they remain limited in model interpretability and computational efficiency and rely on large labeled datasets. Additionally, conventional diagnostic methods such as piezoresistive, piezoelectric, and optical lever techniques are often cost-prohibitive and complex, which limits their widespread use. This study proposes a hybrid ML framework that integrates H2O AutoML with autoencoder-based feature extraction and Fisher Score-based feature selection. To enhance model transparency and clinical trust, Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are employed. The workflow begins with exploratory data analysis (EDA) and dimensionality reduction using a stacked autoencoder, followed by selection of the top predictive features via Fisher Score. The refined feature set is used to train multiple models via H2O AutoML, from which the best-performing deep learning model is selected. On the training dataset, the selected model achieved 95.24% accuracy, an AUC of 98.10%, and a log loss of 0.1747. Cross-validation confirms the model's robustness, with consistent AUC and log loss values. At the optimal F1 threshold of 0.517, the confusion matrix indicates an error rate of 5.75% for actual negatives and 2.59% for actual positives, giving an overall error rate of 4.14%. LIME and SHAP are used to interpret predictions at the instance level, providing actionable insights for clinicians. These results demonstrate the effectiveness of combining AutoML with explainable AI and advanced feature engineering to enhance the predictive power and interpretability of cervical cancer risk models, offering a scalable solution for clinical decision support.
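The Fisher Score selection step mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It assumes the common definition of the Fisher Score (between-class scatter divided by within-class scatter, computed per feature); the abstract does not state the exact variant the authors used. The function names `fisher_score` and `top_k_features` and the synthetic data are hypothetical stand-ins, not the study's code or dataset.

```python
import random
from statistics import fmean, pvariance


def fisher_score(values, labels):
    """Fisher Score of one feature: between-class scatter / within-class scatter."""
    overall_mean = fmean(values)
    between = 0.0
    within = 0.0
    for label in set(labels):
        group = [v for v, l in zip(values, labels) if l == label]
        n = len(group)
        between += n * (fmean(group) - overall_mean) ** 2
        within += n * pvariance(group)
    # A feature that is constant within every class has no within-class scatter;
    # return 0.0 here to avoid dividing by zero (an illustrative convention).
    return between / within if within > 0 else 0.0


def top_k_features(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Synthetic stand-in for encoded features (assumption: NOT the study's data).
# Feature 0 separates the two classes; features 1 and 2 are pure noise.
random.seed(0)
labels = [i % 2 for i in range(200)]
rows = [
    [random.gauss(3.0 * y, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    for y in labels
]

selected, scores = top_k_features(rows, labels, k=1)
print("selected feature indices:", selected)
```

In a pipeline like the one described, the rows would be the stacked autoencoder's encoded representation rather than raw inputs, and the selected columns would then be passed on to model training.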
