Diagnosing schizophrenia with routine blood tests: a comparative analysis of machine learning algorithms



Abstract

INTRODUCTION: Schizophrenia is a severe mental disorder affecting approximately 1% of the general population, diagnosed primarily using clinical criteria. Due to the lack of objective diagnostic methods and reliable biomarkers, accurate diagnosis and effective treatment remain challenging. Peripheral blood biomarkers have recently attracted attention, and machine learning methods offer promising analytical capabilities to enhance diagnostic accuracy.

METHODS: This retrospective, case-control study included 203 schizophrenia patients treated over a five-year period at a tertiary hospital and 192 age- and sex-matched healthy controls. Demographic data and routine hematological and biochemical parameters were extracted from medical records. Variables missing more than 85% of data were excluded; remaining missing values were imputed after train-test splitting to avoid data leakage. Optimal biomarker subsets were selected using Grey Wolf Optimization (GWO). Random Forest (RF), XGBoost, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression (LR) models were trained and evaluated via stratified 10-fold cross-validation.

RESULTS: Groups were homogeneous in terms of age and sex. Before GWO optimization, XGBoost (95.55%) and Random Forest (94.63%) yielded the highest accuracies. Following optimization, Random Forest accuracy improved (94.95%) with a recall of 96.25%, while XGBoost reached the highest accuracy (95.90%) and strong specificity (95.54%). Post-optimization, Area Under the Curve (AUC) values were highest for XGBoost (0.96) and Random Forest (0.95), indicating strong diagnostic performance. Total protein, glucose, iron, creatine kinase, total bilirubin, uric acid, calcium, and sodium were key biomarkers distinguishing schizophrenia. Interestingly, glucose levels were significantly lower in schizophrenia patients compared to controls, contrary to typical findings. Differences in triglycerides, liver enzymes, sodium, and potassium lacked clear clinical significance.

DISCUSSION: The machine learning models developed provided diagnostic accuracy comparable to studies utilizing more expensive biomarkers, highlighting potential clinical and economic advantages. External validation is recommended to further confirm the generalizability and clinical utility of these findings.
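
The METHODS note that missing values were imputed only after train-test splitting, within stratified 10-fold cross-validation. The sketch below illustrates that idea and is not the authors' implementation. The abstract does not name the imputation method, so median imputation is an assumption here, and the function names (`stratified_folds`, `fit_medians`, `impute`, `leakage_safe_splits`) are hypothetical. It uses only the Python standard library.

```python
import random
import statistics


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of the class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def fit_medians(rows):
    """Column medians from observed (non-None) values.

    Assumes every column has at least one observed value in `rows`.
    """
    medians = []
    for col in range(len(rows[0])):
        observed = [row[col] for row in rows if row[col] is not None]
        medians.append(statistics.median(observed))
    return medians


def impute(rows, medians):
    """Replace each None with the median of its column."""
    return [[medians[c] if v is None else v for c, v in enumerate(row)]
            for row in rows]


def leakage_safe_splits(X, y, k=10, seed=0):
    """Yield (X_train, y_train, X_test, y_test) for each of the k folds.

    The medians are fitted on the training rows only (assumed median
    imputation), then applied to both training and held-out rows, so no
    statistic from the test fold influences training.
    """
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        train_rows = [X[i] for i in train_idx]
        medians = fit_medians(train_rows)
        yield (impute(train_rows, medians),
               [y[i] for i in train_idx],
               impute([X[i] for i in test_idx], medians),
               [y[i] for i in test_idx])
```

Each yielded tuple can be passed to any classifier's fit and evaluate steps. The design point is that `fit_medians` never sees the held-out fold; imputing the full dataset before splitting would let test-fold values leak into the training features.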
