Abstract
BACKGROUND: Copy number variations (CNVs) are associated with numerous complex diseases, yet assessing their pathogenicity still depends heavily on manual expert review, which is cumbersome and time-consuming. Although a variety of automated tools for CNV pathogenicity prediction have been developed, systematic assessments of their performance and clinical generalisability under the five-category classification framework defined by the ACMG standards remain scarce. METHODS: This study constructed an expert-reviewed CNV benchmark dataset and, under the ACMG five-category system, systematically compared the classification performance of five prediction tools on both deletion and duplication CNVs. It further assessed the tools' applicability to real clinical samples and evaluated potential biases towards specific variant types. RESULTS: The tools differed in performance and showed specialised strengths on different classification tasks, yet their overall ability to distinguish clinical risk boundaries across the five categories remained inadequate. AnnotSV performed strongly on pathogenic samples, achieving accuracies of 95.16% and 86.39% on the deletion and duplication benchmark datasets, respectively. ClassifyCNV performed best on variants of uncertain significance, with an accuracy of 95.33%. Furthermore, all tools showed significantly better five-category discrimination for deletion samples than for duplication samples, revealing a pronounced performance bias across variant types. CONCLUSION: Current automated CNV prediction tools discriminate deletion variants better than duplications; however, they remain insufficient to replace manual expert review within the ACMG five-category framework. Further optimisation using multicentre, consistently curated clinical gold-standard datasets is required to improve their reliability and clinical applicability.