Machine learning in early screening for high-grade cervical intraepithelial neoplasia using blood testing

Abstract

BACKGROUND: High-grade cervical intraepithelial neoplasia (CIN2/3) is a critical precursor to cervical cancer, yet current screening methods (e.g., HPV testing, colposcopy) face challenges in accessibility and invasiveness, especially in resource-limited settings. We aimed to develop a non-invasive, machine learning (ML)-based model using routine blood biomarkers. This model is intended to assess the risk of high-grade CIN and potentially serve as a triage tool before colposcopy.

METHODS: Data were collected from two groups: 128 patients with high-grade CIN (CIN2/3) and 120 patients with low-grade CIN (CIN1). A total of 29 clinical characteristics and blood test measurements were considered for model development. Four feature selection algorithms (F-test, LASSO regression, decision tree, and random forest) were used to identify key predictors, and 11 machine learning algorithms were used for model training. The dataset was split into training (70%) and testing (30%) cohorts. Model performance was evaluated using learning curves, receiver operating characteristic (ROC) curves, area under the curve (AUC), Brier score, calibration curves, precision-recall (PR) curves, and decision curve analysis (DCA). Feature importance was assessed using the SHapley Additive exPlanations (SHAP) approach, and a web-based calculator was developed for clinical deployment.

RESULTS: Key features selected for the model included creatinine (CREA), red blood cell count (RBC), neutrophil ratio (NEU%), direct bilirubin (DBIL), and monocyte count (MON). The Support Vector Machine (SVM) algorithm achieved the best predictive performance, with an AUC of 0.75 (95% CI: 0.69–0.80) and a Brier score of 0.21 (95% CI: 0.17–0.28). By employing the SHAP method, we identified the variables that contributed to the model. The web tool (https://dvhl6xsf29zmdewixjx7kz.streamlit.app) provides real-time risk stratification.
CONCLUSIONS: The model demonstrated strong performance across various validation metrics, with the SVM algorithm achieving an AUC of 0.75, indicating potential clinical utility. We also developed a web-based calculator to estimate the risk of high-grade CIN.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-025-03321-z.
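
The two headline metrics in the abstract, AUC and Brier score, each reported with a 95% confidence interval, can be illustrated with a short sketch. The abstract does not say how the intervals were obtained, so the percentile bootstrap below is an assumption rather than the authors' procedure. The labels and predicted probabilities are synthetic toy values, not study data, and the function names are hypothetical.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(metric, y_true, y_prob, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method): resample cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples containing a single class (AUC undefined)
        ps = [y_prob[i] for i in idx]
        stats.append(metric(ys, ps))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic toy data: 1 = high-grade CIN, 0 = low-grade CIN (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
auc_ci = bootstrap_ci(roc_auc, y_true, y_prob)
brier_ci = bootstrap_ci(brier_score, y_true, y_prob)

print("AUC:", round(auc, 4), "95% CI:", auc_ci)
print("Brier:", round(brier, 4), "95% CI:", brier_ci)
```

A lower Brier score indicates better-calibrated probabilities, and an AUC of 0.5 corresponds to chance-level discrimination. With a test cohort of roughly 30% of 248 patients, intervals of the width reported in the abstract are plausible, but the sketch makes no attempt to reproduce them.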
