Development of a diagnostic model for ovarian cancer based on machine learning algorithms and functional analysis of key biomarker SOX17.

BACKGROUND: Ovarian cancer (OC) has the poorest prognosis among gynecological malignancies, with five-year survival rates below 45%, primarily because of late-stage diagnosis. To address this challenge, we systematically identified OC-specific differentially expressed genes (DEGs) and used them to develop a robust diagnostic model, comparing eleven machine learning algorithms. We also explored the potential mechanism of a key DEG in OC.

METHODS: We acquired RNA sequencing data for 426 tissues (352 OC tumor and 74 adjacent non-tumor) from the Gene Expression Omnibus (GEO) repository. Following rigorous batch effect correction and normalization, DEGs were screened between tumor and non-tumor specimens. The resulting DEGs underwent comprehensive functional characterization, including Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, protein-protein interaction (PPI) network, and immune microenvironment analyses. To optimize diagnostic feature selection, we implemented a tiered analytical approach combining the F-test, LASSO regression and Pearson correlation. The curated gene subset served as input for developing machine learning classifiers, with the cohort partitioned into stratified training (70%) and validation (30%) subsets. Eleven distinct algorithms were evaluated through iterative 10-fold cross-validation, with model performance quantified via receiver operating characteristic (ROC) analysis, precision-recall (PR) metrics, calibration curves, learning curves, and decision curve analysis (DCA). Finally, we investigated the biological functions of one key gene, SRY-box containing gene 17 (SOX17), in OC cell lines through in vitro experiments.

RESULTS: We identified 27 DEGs with distinct expression patterns in OC, of which 16 were upregulated and 11 were downregulated. GO enrichment analysis suggested that the DEGs were significantly enriched in response to folic acid, blood microparticle and alcohol dehydrogenase [NAD(P)+] activity. KEGG pathway analysis indicated that these DEGs were mainly involved in tyrosine metabolism, fatty acid degradation, ABC transporters and pyruvate metabolism. Immune microenvironment profiling revealed substantial M2 macrophage polarization and cytotoxic T-cell exhaustion in tumor tissues. The optimal diagnostic model was built on five key genes (CD24, CLEC4M, SOX17, ADH1C and CHRDL1), with logistic regression as the best-performing algorithm. The area under the receiver operating characteristic curve (AUC) and the accuracy of the model were 0.93 and 0.875, respectively. SOX17 was upregulated in OC tumor tissues, and knockdown of SOX17 markedly suppressed tumor cell proliferation and migration.

CONCLUSION: Our multivariable diagnostic model, based on five genes and optimized with logistic regression, demonstrated robust discriminative capacity for OC. SOX17 functions as a suppressor and potential therapeutic target for OC.
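
The tiered feature selection and evaluation workflow described in METHODS can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the authors' code. The abstract does not state these details, so they are assumptions: the synthetic matrix from `make_classification`, the p < 0.05 cutoff for the F-test, `LassoCV` fitted on the 0/1 label as the LASSO step, the |r| < 0.8 Pearson redundancy threshold, and the choice to run selection on the training split only (to avoid information leaking into the held-out 30%).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (426 samples x 27 candidate genes).
# The class weights only mimic the 74 non-tumor / 352 tumor imbalance; not real data.
X, y = make_classification(
    n_samples=426, n_features=27, n_informative=5, n_redundant=4,
    n_clusters_per_class=1, weights=[74 / 426], class_sep=1.5, random_state=0,
)

# Stratified 70/30 split, as described in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Tier 1: univariate F-test; keep features with p < 0.05 (assumed cutoff).
f_scores, p_values = f_classif(X_train, y_train)
tier1 = np.flatnonzero(p_values < 0.05)

# Tier 2: LASSO; keep features with a non-zero coefficient.
# Assumption: plain LASSO regression on the binary label, alpha chosen by CV.
X_std = StandardScaler().fit_transform(X_train[:, tier1])
lasso = LassoCV(cv=5, random_state=0).fit(X_std, y_train)
tier2 = tier1[np.flatnonzero(lasso.coef_)]

# Tier 3: Pearson redundancy filter. Visit features from strongest to weakest
# F-score and keep one only if |r| < 0.8 (assumed) with every feature already kept.
selected = []
for j in tier2[np.argsort(-f_scores[tier2])]:
    if all(abs(np.corrcoef(X_train[:, j], X_train[:, k])[0, 1]) < 0.8
           for k in selected):
        selected.append(int(j))

# Logistic regression evaluated with stratified 10-fold CV on the training split.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = cross_val_score(model, X_train[:, selected], y_train,
                         cv=cv, scoring="roc_auc")

# Final fit and held-out evaluation (AUC and accuracy, the metrics reported).
model.fit(X_train[:, selected], y_train)
test_proba = model.predict_proba(X_test[:, selected])[:, 1]
test_auc = roc_auc_score(y_test, test_proba)
test_acc = accuracy_score(y_test, model.predict(X_test[:, selected]))

print("selected feature indices:", selected)
print("mean 10-fold CV AUC:", round(float(cv_auc.mean()), 3))
print("held-out AUC:", round(float(test_auc), 3))
print("held-out accuracy:", round(float(test_acc), 3))
```

Comparing several algorithms, as the study does with eleven, would amount to repeating the cross-validation step with a different estimator in the pipeline and keeping the same folds so the scores are comparable.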
