Abstract
MOTIVATION: The American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines are the gold standard for clinical variant interpretation. Despite their widespread adoption, the software tools that implement them have not been comprehensively compared, leaving clinicians without evidence-based guidance on which tools to use in practice. RESULTS: We benchmarked four ACMG/AMP-based tools (Franklin, InterVar, TAPES, GeneBe), selected from 22 candidates on the basis of free availability, VCF compatibility, operational reliability, and not being disease-specific, and compared their performance with that of LIRICAL, a top-performing phenotype-driven tool, on 151 expert-curated datasets from Mendelian disorders. Our evaluation framework assessed top-N accuracy (N = 1, 5, 10, 20, 50), retention rates, precision, recall, F1 scores, and area under the curve (AUC). Statistical validation used bootstrap confidence intervals (n = 1000 resamples) and Friedman tests. LIRICAL (68.21%) and Franklin (61.59%) achieved the highest top-10 variant prioritization accuracy, significantly outperforming the other tools (P < .0001). These results indicate that tools with advanced phenotypic integration significantly outperform those relying primarily on genomic features. AVAILABILITY AND IMPLEMENTATION: All data and source code required to reproduce the findings of this study are openly available in the Code Ocean repository at https://doi.org/10.24433/CO.6562438.v1.
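As an illustration of the evaluation metrics named above, the following is a minimal sketch of top-N accuracy with a percentile bootstrap confidence interval. It is not the study's code (that is in the Code Ocean repository); the function names, the percentile method, the use of `None` for a causal variant that a tool did not retain, and the example ranks are all assumptions made for illustration.

```python
import random


def top_n_accuracy(ranks, n):
    """Fraction of cases whose causal variant is ranked within the top n.

    `ranks` holds the 1-based rank of the known causal variant per case,
    or None when the tool filtered it out or did not report it (assumed encoding).
    """
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks)


def bootstrap_ci(ranks, n, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for top-n accuracy."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement, keeping the sample size fixed.
        sample = [rng.choice(ranks) for _ in ranks]
        stats.append(top_n_accuracy(sample, n))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical ranks for illustration only, not data from the study.
example_ranks = [1, 3, 12, None, 7, 2, 55, 1, 9, 21]
acc = top_n_accuracy(example_ranks, 10)
low, high = bootstrap_ci(example_ranks, 10)
```

In this sketch, resampling is done over cases rather than over variants, so each bootstrap replicate mimics drawing a new cohort of the same size. Whether the study used the percentile method or another bootstrap variant is not stated in the abstract.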