Abstract
The lack of accurate and reliable methods for early detection remains a major gap in ovarian cancer diagnosis and management. The recent integration of machine learning with cancer diagnostic techniques, particularly biomarker-based blood tests, has the potential to substantially improve the selectivity and sensitivity of ovarian cancer detection. Herein, we apply a series of machine learning and statistical approaches to clinically relevant data sets comprising more than 300 patients with ovarian tumors and 47 blood-obtained features to distinguish cancerous from benign tumors. We found that HE4, CA125, menopausal status, and age were among the most important features distinguishing cancerous from benign ovarian tumors in all patient populations. Age was a critical feature with strong cancer discriminatory power in premenopausal patients but less so in postmenopausal patients. Systematic consideration of patient menopausal status, the type of machine learning algorithm, and the number of clinical features is necessary prior to ovarian cancer screening to yield more accurate and reliable diagnostic results. Overall, this study provides deeper insight into the use of machine learning, feature selection, and other relevant quantitative approaches to advance ovarian cancer diagnosis and improve patient outcomes.