A machine learning approach for non-invasive PCOS diagnosis from ultrasound and clinical features



Abstract

This study investigates the use of machine learning (ML) algorithms to support faster and more accurate diagnosis of polycystic ovary syndrome (PCOS), with a focus on both predictive performance and clinical applicability. Multiple algorithms were evaluated, including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost). XGBoost consistently outperformed the other models and was selected for final development and validation. To align with the Rotterdam criteria, the dataset was structured into three feature categories: clinical, biochemical, and ultrasound (USG) data. The study explored various combinations of these feature subsets to identify the most efficient diagnostic pathways. Feature selection using the chi-square-based SelectKBest method identified the top 10 predictive features, which were further validated through XGBoost's internal feature importance, SHAP analysis, and expert clinical assessment. The final XGBoost model demonstrated robust performance across multiple feature combinations:
• Clinical + USG + AMH: AUC = 0.9947, Precision = 0.9553, F1 Score = 0.9553, Accuracy = 0.9553.
• Clinical + USG: AUC = 0.9852, Precision = 0.9583, F1 Score = 0.9388, Accuracy = 0.9384.
The most influential features included follicle count in both ovaries, weight gain, Anti-Müllerian Hormone (AMH) levels, hair growth, menstrual irregularity, fast food consumption, pimples, and hair loss. External validation was performed using a publicly available dataset containing 320 instances and 18 diagnostic features. The XGBoost model trained on the top-ranked features achieved perfect performance on the test set (AUC = 1.0, Precision = 1.0, F1 Score = 1.0, Accuracy = 1.0), though further validation is necessary to rule out overfitting or data leakage.
These findings suggest that combining clinical and ultrasound features enables highly accurate, non-invasive, and cost-effective PCOS diagnosis. This study demonstrates the potential of ML-driven tools to streamline clinical workflows, reduce reliance on invasive diagnostics, and support early intervention in women's health.
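
The chi-square ranking step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code. It reimplements, with the standard library only, the kind of score that a chi-square-based SelectKBest computes: for each non-negative feature, the observed per-class sum is compared with the sum expected if the feature were independent of the diagnosis. The feature names, the toy rows, and the helper names `chi2_scores` and `select_k_best` are hypothetical, and k = 2 is used here only because the toy data has three features (the study kept the top 10).

```python
from collections import defaultdict


def chi2_scores(rows, labels):
    """Return one chi-square score per feature (features must be non-negative)."""
    n_samples = len(rows)
    n_features = len(rows[0])
    class_counts = defaultdict(int)
    observed = defaultdict(lambda: [0.0] * n_features)

    # Observed statistic: sum of each feature's values within each class.
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        for j, value in enumerate(row):
            observed[label][j] += value

    scores = []
    for j in range(n_features):
        feature_total = sum(observed[c][j] for c in class_counts)
        score = 0.0
        for c, count in class_counts.items():
            # Expected sum if the feature were independent of the class.
            expected = feature_total * count / n_samples
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(names, rows, labels, k):
    """Rank features by chi-square score and keep the k highest."""
    ranked = sorted(zip(names, chi2_scores(rows, labels)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical toy data: [follicle_count, weight_gain (0/1), fast_food (0/1)]
feature_names = ["follicle_count", "weight_gain", "fast_food"]
rows = [
    [12, 1, 1], [15, 1, 0], [11, 1, 1], [14, 0, 1],  # label 1 (PCOS)
    [5, 0, 1], [4, 0, 0], [6, 1, 1], [5, 0, 0],      # label 0 (no PCOS)
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

top_features = select_k_best(feature_names, rows, labels, k=2)
for name, score in top_features:
    print(f"{name}: {score:.3f}")
```

In practice the selected columns would then be passed to the classifier, and a ranking like this would be cross-checked against model-based importance and clinical judgment, as the abstract describes.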
