Abstract
Diagnostic models are typically evaluated by assessing their calibration and discrimination; however, neither criterion assesses the practical consequences of using a model. Decision Curve Analysis (DCA) is a method for measuring the clinical utility of binary outcome models over a range of risk thresholds. While the utility of polytomous outcome models can be assessed by applying DCA to different dichotomizations of their categories, no method exists to synthesize the resulting binary measures into a single value. This paper illustrates DCA for polytomous outcomes and extends its concepts to develop a summary utility measure for polytomous outcome models. We apply this method to three ordinal logistic regression models, including the NIRUDAK and DHAKA models for predicting dehydration severity in patients over and under five years of age, respectively. Combining the concepts of Standardized Net Benefit (sNB) and the Weighted Area Under the Net Benefit Curve, we propose the Weighted Area Under the sNB Curve (wAUCsNB), which can be determined for every dichotomization of a polytomous outcome. Next, we propose an average of the wAUCsNBs weighted by the relative clinical importance of each dichotomized outcome. We term these weights importance weights and define this new measure as the Integrated Weighted Area Under the sNB Curve (IwAUCsNB). We apply binary DCA to the dehydration models, discuss its limitations, and apply the Integrated wAUCsNB to evaluate the average utility of each model. Finally, we compare these models to criteria from the World Health Organization (WHO) and observe how the results vary under different distributional assumptions for the risk thresholds. Applied to the NIRUDAK, DHAKA, and WHO models, the Integrated wAUCsNB demonstrated that both the DHAKA and NIRUDAK models identified individuals who would benefit from treatment better than the WHO algorithms and better than either reference strategy of treating everyone or treating no one.
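
As a rough illustration of how the quantities named above fit together, the following Python sketch computes the sNB at a single threshold, averages it over a weighted grid of thresholds, and then combines the results for several dichotomizations using importance weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the discrete threshold grid, and the normalization of both weight vectors to sum to one are all hypothetical choices made here for illustration, and the exact formulation in the body of the paper may differ. The sNB uses the standard definition of net benefit (true-positive proportion minus false-positive proportion times the threshold odds) divided by outcome prevalence.

```python
import numpy as np

def standardized_net_benefit(y, risk, t):
    """sNB at threshold t: net benefit divided by outcome prevalence.

    y    : binary outcome for one dichotomization (1 = event)
    risk : predicted probability of the event
    t    : risk threshold in (0, 1); patients with risk >= t are treated
    """
    y = np.asarray(y, dtype=bool)
    treat = np.asarray(risk, dtype=float) >= t
    n = y.size
    tp = np.sum(treat & y) / n       # true positives as a proportion of all patients
    fp = np.sum(treat & ~y) / n      # false positives as a proportion of all patients
    prevalence = y.mean()
    return (tp - fp * t / (1.0 - t)) / prevalence

def weighted_area_snb(y, risk, thresholds, threshold_weights):
    """Weighted average of sNB over a threshold grid (assumed discretization).

    threshold_weights stands in for the assumed distribution of risk
    thresholds; it is normalized here so the weights sum to 1.
    """
    w = np.asarray(threshold_weights, dtype=float)
    w = w / w.sum()
    snb = np.array([standardized_net_benefit(y, risk, t) for t in thresholds])
    return float(np.sum(w * snb))

def integrated_wauc_snb(wauc_values, importance_weights):
    """Importance-weighted average of per-dichotomization wAUCsNB values."""
    v = np.asarray(importance_weights, dtype=float)
    return float(np.dot(v / v.sum(), np.asarray(wauc_values, dtype=float)))

# Toy example: one dichotomization, a perfectly separating model.
y_toy = [0, 0, 1, 1]
risk_toy = [0.0, 0.0, 1.0, 1.0]
grid = [0.1, 0.2, 0.3]
wauc_toy = weighted_area_snb(y_toy, risk_toy, grid, [1, 1, 1])
```

In this sketch, a model that separates events from non-events perfectly attains an sNB of 1 at every threshold, so its weighted area is also 1, while "treat everyone" is represented by setting every predicted risk to 1. For an ordinal outcome, one would call `weighted_area_snb` once per dichotomization (for example, "any dehydration" versus none, and "severe dehydration" versus not severe) and pass the resulting values to `integrated_wauc_snb` together with the chosen importance weights.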