Abstract
BACKGROUND: Identifying dysmorphic red blood cells (RBCs) in urine is critical for diagnosing glomerular diseases, as distinguishing glomerular from non-glomerular hematuria may reduce reliance on invasive diagnostics such as kidney biopsy. We aimed to improve urinary RBC morphological classification by applying machine learning (ML) to UF-5000 scattergram data. METHODS: RBCs in urine samples (N=185) were classified as dysmorphic or isomorphic based on microscopic findings. UF-5000 scattergrams were quantified into 20 statistical features, which were used to train an ML model in DataRobot (v9.1) with an automated pipeline, five-fold cross-validation, and LogLoss-based model selection. Performance was evaluated in an independent cohort (N=1,093). Accuracy was defined as concordance with microscopy findings. Areas under the receiver operating characteristic curve (AUROCs) and diagnostic metrics are reported with 95% confidence intervals (CIs). RESULTS: Among conventional UF-5000 parameters, the small RBC/total RBC ratio was the strongest predictor (AUROC 0.97, 95% CI 0.94-0.99). Scattergram-derived features indicated that RBC size-related parameters were crucial for identifying dysmorphic RBCs. The ML model alone was more accurate than UF-5000 RBC-Info alone (concordance 95.2% vs. 92.1%; AUROC 0.95 [0.94-0.97] vs. 0.92 [0.91-0.94]). Logical (OR/AND) combinations of the ML model with RBC-Info also outperformed RBC-Info alone (OR: concordance 92.7%, AUROC 0.93 [0.92-0.95]; AND: concordance 94.6%, AUROC 0.94 [0.93-0.96]). CONCLUSIONS: An ML model trained on UF-5000 scattergram features improves the accuracy and reliability of urinary RBC morphological classification and may help reduce reliance on invasive diagnostics. Prospective, multicenter studies should validate generalizability and assess integration into routine workflows.
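As a rough illustration of the OR/AND combinations and the concordance metric described above, the following minimal Python sketch combines two per-sample binary calls and scores them against microscopy. It is not the study's implementation: the function names `combine` and `concordance`, the encoding of each call as a boolean ("dysmorphic" = `True`), and the toy data are all assumptions made for this example, and the abstract does not state how the ML model's output was thresholded into a binary call.

```python
from typing import List, Sequence


def combine(ml: Sequence[bool], rbc_info: Sequence[bool], rule: str) -> List[bool]:
    """Combine two per-sample 'dysmorphic' calls with a logical rule.

    'OR'  flags a sample if either method flags it.
    'AND' flags a sample only if both methods flag it.
    (Hypothetical helper; names are not from the study.)
    """
    if len(ml) != len(rbc_info):
        raise ValueError("inputs must have the same length")
    if rule == "OR":
        return [a or b for a, b in zip(ml, rbc_info)]
    if rule == "AND":
        return [a and b for a, b in zip(ml, rbc_info)]
    raise ValueError(f"unknown rule: {rule!r}")


def concordance(pred: Sequence[bool], truth: Sequence[bool]) -> float:
    """Fraction of samples where the call agrees with the microscopy label."""
    if len(pred) != len(truth) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


# Toy data invented for illustration only (True = dysmorphic).
microscopy = [True, True, False, False, True, False]
ml_call = [True, True, False, True, False, False]
rbc_info = [True, False, False, False, True, False]

for rule in ("OR", "AND"):
    combined = combine(ml_call, rbc_info, rule)
    print(rule, round(concordance(combined, microscopy), 3))
```

In general, an OR rule can only add positive calls and an AND rule can only remove them, so the two rules trade sensitivity against specificity in opposite directions; which one yields higher concordance depends on the data.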