Abstract
BACKGROUND: Under pressure to contain health insurance costs while pursuing high-quality development, China has made diagnosis-related groups (DRGs) and other case-based payment schemes a major direction of inpatient payment reform, yet the impacts of these schemes may extend beyond changes in mean costs. Using large-scale inpatient settlement data, we evaluate the cost-containment effects of DRG payment reform from a distributional perspective, testing whether the reform not only lowers mean standardized inpatient spending but also reshapes spending variability and heavy tails (tail risk).

METHODS: The study covers six cities in China’s Province S from January 2019 to December 2021; three of these cities launched DRG reform, in November 2020, January 2021, and June 2021, respectively. The individual-level sample includes 6,882,875 admissions from 172 healthcare institutions and is aggregated into a hospital–month panel. We construct individual standardized inpatient spending as “spending per admission divided by DRG weight” and then compute the hospital–month mean, standard deviation, skewness, and kurtosis of standardized spending. Our identification strategy applies a two-stage difference-in-differences (DID) estimator suited to staggered adoption to estimate the average treatment effect and uses the Sun & Abraham event-study approach to test parallel trends and trace dynamic effects. We control for patient composition and city-level covariates and include institution and month fixed effects. We further assess heterogeneity by hospital level and DRG design (point-based vs. rate-based).
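
The outcome construction described in the methods can be illustrated with a minimal sketch. It is not the study's code: the field names (`hospital_id`, `month`, `spending`, `drg_weight`), the toy records, and the moment conventions (sample standard deviation, population-moment skewness, excess kurtosis defined as kurtosis minus 3) are assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def moments(values):
    """Mean, sample SD, skewness, and excess kurtosis of a list of numbers.

    Assumed conventions: SD uses the n-1 denominator; skewness and
    kurtosis use population central moments; excess kurtosis = kurtosis - 3.
    """
    n = len(values)
    mean = sum(values) / n
    devs = [v - mean for v in values]
    m2 = sum(d ** 2 for d in devs) / n
    m3 = sum(d ** 3 for d in devs) / n
    m4 = sum(d ** 4 for d in devs) / n
    nan = float("nan")
    return {
        "mean": mean,
        "sd": sqrt(m2 * n / (n - 1)) if n > 1 else nan,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else nan,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else nan,
    }


def hospital_month_panel(admissions):
    """Aggregate admission records into hospital-month distributional metrics."""
    groups = defaultdict(list)
    for adm in admissions:
        # Standardized spending: spending per admission divided by DRG weight.
        # Assumes every record carries a strictly positive DRG weight.
        standardized = adm["spending"] / adm["drg_weight"]
        groups[(adm["hospital_id"], adm["month"])].append(standardized)
    return {key: moments(vals) for key, vals in sorted(groups.items())}


# Hypothetical toy records, not taken from the study data.
admissions = [
    {"hospital_id": "H1", "month": "2021-01", "spending": 1000.0, "drg_weight": 1.0},
    {"hospital_id": "H1", "month": "2021-01", "spending": 3000.0, "drg_weight": 1.5},
    {"hospital_id": "H1", "month": "2021-01", "spending": 1500.0, "drg_weight": 0.5},
    {"hospital_id": "H2", "month": "2021-01", "spending": 2000.0, "drg_weight": 1.0},
    {"hospital_id": "H2", "month": "2021-01", "spending": 4000.0, "drg_weight": 1.0},
]

panel = hospital_month_panel(admissions)
for key, stats in panel.items():
    print(key, stats)
```

Each hospital–month row of such a panel would then serve as one observation in the DID and event-study regressions; small cells make skewness and kurtosis unstable, so a minimum admission count per cell may be advisable in practice.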
RESULTS: The two-stage DID results show that DRG implementation is associated with a significant reduction in the hospital–month mean of standardized spending (β = −705.7, p < 0.001) and in its standard deviation (β = −1,098.0, p < 0.001). Meanwhile, distributional shape metrics increase, with higher skewness (β = 0.380, p < 0.05) and higher excess kurtosis (β = 18.3, p < 0.01). In the event-study analysis, most pre-reform lead coefficients fluctuate around zero and their confidence intervals largely cover zero, providing overall support for the parallel-trends assumption. Regarding heterogeneity, the interaction term for tertiary hospitals (relative to secondary hospitals) is not statistically significant for any of the four outcomes. Relative to the rate-based design, the point-based design significantly mitigates the increase in distributional shape metrics, with lower skewness (β = −0.941, p < 0.001) and lower excess kurtosis (β = −38.4, p < 0.001), while differences in the mean and standard deviation are not statistically significant.

CONCLUSIONS: While DRG payment reform reduces the mean level of standardized inpatient spending and within-hospital spending variability, it may be accompanied by a more pronounced right skew and heavier tails, indicating that “mean-based cost containment” is insufficient to characterize the reform’s structural consequences; distributional metrics and tail risk should be incorporated into routine performance assessment and regulatory frameworks. Compared with the rate-based approach, the point-based approach appears more advantageous in curbing heavy-tail indicators, underscoring the importance of settlement mechanisms and payment-parameter design for risk discipline.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13561-026-00763-7.