Abstract
BACKGROUND: Standard setting is a critical component of assessment in health professions education, ensuring fairness, defensibility, and validity in determining minimum competence. Among criterion-referenced approaches, the Angoff method and its variants are widely applied. However, the evidence base on their comparative performance remains fragmented, with no recent quantitative synthesis available. This systematic review and meta-analysis aimed to evaluate the outcomes of Angoff methods across different educational and assessment contexts.
METHODS: A systematic electronic search was conducted for studies that applied Angoff or its variants [Modified, Yes/No, Three-level, Group (Yes/No), or Mastery] in health professions assessments and reported at least one of the following: cut score, pass rate, or inter-rater reliability. Pooled estimates were calculated using generalized linear mixed models and random-effects meta-analysis. The odds ratio (OR) was used as the effect estimate for pass rate in comparative studies. Meta-regression analyses explored heterogeneity by number of judges and test length.
RESULTS: A total of 91 studies were included in the systematic review. Mixed treatment comparison pooled estimates revealed significantly higher pass rates with Angoff (OR: 7.48) compared to the conventional fixed method. Mastery Angoff (88.2%) and Mastery Angoff with reality check (86.9%) were associated with higher cut scores, resulting in substantially lower pass rates (61.44% for Mastery Angoff). The Modified Angoff with reality check achieved excellent reliability (r = 0.917), while the Angoff (Yes/No) yielded the weakest and most variable results (r = 0.536), a finding confirmed by bootstrap analyses. Meta-regression indicated that each additional judge was associated with a 0.19-percentage-point increase in cut scores (p = 0.003), while each additional test item was associated with a 0.05-percentage-point increase in the pass rate (p = 0.001).
CONCLUSION: For educators and policymakers, these findings underscore that the Angoff method is not a single technique but a flexible family of approaches with distinct outcomes. The choice of variant allows for the deliberate setting of more lenient or more stringent standards, and the Modified Angoff with reality check offers the most reliable and defensible results. This flexibility is a key strength but also a critical limitation, as an uninformed choice can lead to unintended pass rates. Therefore, selecting a specific Angoff variant must be a conscious decision aligned with the assessment's purpose and context.
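
As a reading aid for the effect sizes reported in the results, the following minimal Python sketch shows how a meta-regression slope and an odds ratio translate into practical quantities. It assumes a simple linear reading of the slopes, and the function names and the 50% baseline pass rate in the usage note are hypothetical illustrations, not values or methods taken from the study.

```python
# Illustrative only: a linear reading of the slopes reported in the abstract.
JUDGE_SLOPE_PP = 0.19  # cut-score change (percentage points) per additional judge
ITEM_SLOPE_PP = 0.05   # pass-rate change (percentage points) per additional test item
ANGOFF_VS_FIXED_OR = 7.48  # reported odds ratio for passing, Angoff vs. fixed method


def expected_shift(slope_pp: float, extra_units: int) -> float:
    """Expected change in percentage points for `extra_units` more judges or items."""
    return slope_pp * extra_units


def odds_ratio_to_pass_rate(baseline_rate: float, odds_ratio: float) -> float:
    """Pass rate (0-1) implied by applying an odds ratio to a baseline pass rate (0-1).

    The baseline rate is an input the caller must supply; the abstract does not report one.
    """
    baseline_odds = baseline_rate / (1.0 - baseline_rate)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)
```

For example, `expected_shift(JUDGE_SLOPE_PP, 5)` gives the cut-score change associated with five additional judges under this linear reading, and `odds_ratio_to_pass_rate(0.5, ANGOFF_VS_FIXED_OR)` shows what the reported odds ratio would imply for a hypothetical fixed-method pass rate of 50%. An odds ratio is not a ratio of pass rates, so the implied pass rate depends strongly on the assumed baseline.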