A Population-Specific Ensemble Machine Learning Model for Predicting Borderline or Malignancy Risk of Ovarian Masses in Macao: A Multicenter Retrospective Study

基于人群特异性集成机器学习模型预测澳门卵巢肿块的临界性或恶性风险:一项多中心回顾性研究

阅读:2

Abstract

BACKGROUND: Preoperative discrimination between benign and malignant ovarian tumors is important. The applicability of published prediction tools may be limited across different health systems. We aim to develop a machine learning model specifically for Macao's population to predict the borderline or malignancy risk of ovarian masses using routinely available clinical data in Macao's health system. METHODS: The study cohorts were derived from 2 major hospitals in Macao, including 496 patients who underwent oophorectomy or cystectomy for ovarian masses at CHCSJ between January 2014 and December 2023, along with a simulated prospective cohort of 95 patients from CHCSJ between January 2024 and November 2024, and an external validation cohort of 61 patients from KWH between January 2020 and September 2024. Patients' clinical information, ultrasound features, and laboratory test results before initial treatment were collected. LASSO regression was used for feature selection, and classifiers were developed using various machine learning algorithms. The predictions were compared with postoperative pathological diagnoses. The predictive performance was also compared with the RMI-4. RESULTS: Age, menopausal status, 5 ultrasound features, and 7 laboratory tests were identified as predictors of borderline and malignant ovarian tumors. An ensemble learning model based on a voting classifier was selected as the final model. Our model outperformed RMI-4 in the internal test set, simulated prospective cohort, and external validation cohort, achieving an area under the curve (AUC) of 0.923-0.951 (vs 0.810-0.868, P < .05). Decision curve analysis demonstrated superior clinical utility, and SHAP analysis confirmed its interpretability. CONCLUSIONS: We propose a machine learning model targeting Macao's population for predicting the borderline or malignancy risk of ovarian masses. Our model is accurate, low-cost, easily accessible, and interpretable. On the basis of no workflow changes, machine learning techniques can maximize the predictive potential of routinely available clinical data in a specific health system.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。