Abstract
BACKGROUND: Receiver Operating Characteristic (ROC) analysis is commonly used to evaluate how well biomarkers distinguish between diseased and non-diseased individuals. A critical step in this process is choosing an appropriate cut-off value. Although many ROC-based cut-off methods exist, they often lead to different decisions, and guidance on which methods work reliably in practice is limited.
OBJECTIVE: To compare commonly used ROC-based cut-off selection methods and identify those that provide stable and balanced diagnostic performance under realistic data conditions.
METHODS: We conducted extensive simulation studies using symmetric, skewed, and mixed continuous predictors, varying effect size, variance, and sample size. Multiple ROC-based cut-off methods were evaluated using sensitivity and specificity as the primary performance measures. Findings were further validated using real-world datasets involving blood-based biomarkers.
RESULTS: Across all simulations and real-world datasets, the sensitivity and specificity curves showed that Youden’s index, accuracy-based methods, Cohen’s kappa, and the product of sensitivity and specificity consistently produced stable and well-balanced sensitivity and specificity. In contrast, methods such as the F1 score, ROC01, odds ratio, and risk ratio often produced highly variable results, favouring one diagnostic measure at the expense of the other.
CONCLUSION: Youden’s index and the product of sensitivity and specificity offer the most reliable and practical choices for cut-off selection across a wide range of conditions. While other methods may be useful for specific goals, they should be applied with caution in routine diagnostic and prognostic settings.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-026-02828-x.
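To make the cut-off criteria named above concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that higher biomarker values indicate disease, that candidate cut-offs are the observed values, and that the criteria take their commonly used forms: Youden's index as Se + Sp - 1, the product as Se × Sp, and ROC01 as the distance from the ROC point to the (0, 1) corner. The function names (`sens_spec`, `best_cutoff`) and the simulated data are hypothetical and serve only as illustration.

```python
import random


def sens_spec(values, labels, cutoff):
    """Sensitivity and specificity when 'value >= cutoff' is called positive.

    Assumes labels are 0 (non-diseased) or 1 (diseased) and that both
    classes are present, so neither denominator is zero.
    """
    tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
    fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < cutoff)
    tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
    fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Each criterion maps (sensitivity, specificity) to a score to be maximised.
CRITERIA = {
    "youden": lambda se, sp: se + sp - 1.0,
    "product": lambda se, sp: se * sp,
    # ROC01 minimises the distance to the (0, 1) corner of the ROC plot;
    # the squared distance is negated so that larger is better.
    "roc01": lambda se, sp: -((1.0 - se) ** 2 + (1.0 - sp) ** 2),
}


def best_cutoff(values, labels, criterion):
    """Return the observed value that maximises the chosen criterion."""
    score = CRITERIA[criterion]
    candidates = sorted(set(values))
    return max(candidates, key=lambda c: score(*sens_spec(values, labels, c)))


if __name__ == "__main__":
    # Hypothetical simulated biomarker: two normal groups, shifted means.
    random.seed(0)
    healthy = [random.gauss(0.0, 1.0) for _ in range(200)]
    diseased = [random.gauss(1.5, 1.0) for _ in range(200)]
    values = healthy + diseased
    labels = [0] * len(healthy) + [1] * len(diseased)

    for name in CRITERIA:
        c = best_cutoff(values, labels, name)
        se, sp = sens_spec(values, labels, c)
        print(f"{name:8s} cutoff={c:.3f} sensitivity={se:.3f} specificity={sp:.3f}")
```

Running the script prints one cut-off per criterion along with the resulting sensitivity and specificity, which makes it easy to see whether a criterion balances the two measures or favours one of them on a given dataset.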