Abstract
BACKGROUND: In the medical literature, in which time-to-event (such as time to death or disease recurrence) outcomes are commonly studied, hierarchical data is frequently encountered with patients nested within hospitals or regions. Multilevel hierarchical mixed-effects survival models are routinely used in these settings to accommodate the correlation between study subjects belonging to the same cluster and any potential unobserved heterogeneity. However, these analyses usually focus on fixed effects while marginalizing over the random effects, with fully conditional or marginal (on the random effects) post-estimation predictions. METHODS: In this work, we combine regression standardization over the observed covariates with posterior predictions of the random effects to obtain standardized survival probabilities. These predictions quantify how the entire study population would have fared under the performance of each cluster and can be used to obtain fair comparisons between hierarchical units. Compared to other common approaches, such as the median hazard ratio, this proposal yields quantities that are easier to interpret and that, under certain assumptions, can have a causal interpretation. RESULTS: We illustrate the new methodology in practice using data on bladder cancer patients with a three-level hierarchical structure composed of patients nested within surgeons and centers. Then, we illustrate a variety of standardized survival predictions benchmarking, e.g., best/average/worst surgeons and centers, surgeons within a center, or centers directly — all from a single and unified modeling framework. CONCLUSIONS: We introduced an analytical approach that can be used to quantify and fairly compare the differences between hierarchical units using easily interpretable measures such as (standardized) survival probabilities and contrasts thereof. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-026-02782-8.