Accurate and efficient P-values for rank-based independence tests with clustered data using a saddlepoint approximation

Abstract

Accurate statistical inference for clustered data, which are common in multi-center clinical trials and longitudinal studies, is challenging because of within-cluster correlation. Rank-based tests such as the logrank, Wilcoxon, and Datta-Satten tests are valued for their robustness but often suffer from inflated Type I error rates under standard asymptotic approximations. Exact permutation tests offer theoretical accuracy but are computationally impractical for large datasets, which leaves a methodological gap. This paper proposes a double saddlepoint approximation framework that delivers accurate p-values and confidence intervals for a wide class of rank-based tests. The method is built on a novel reformulation of the permutation distribution via a block urn design, which preserves cluster integrity. This reformulation allows the distribution of the test statistic to be represented as a sum of independent conditional random variables, from which a joint cumulant generating function can be derived for the saddlepoint computation. The approach supports analyses with right-censored survival data and tied ranks. Extensive simulations confirm that the saddlepoint method accurately controls Type I error rates, performing identically to permutation-based benchmarks at a vastly lower computational cost. A case study on clinical trial data demonstrates the practical importance of this accuracy, showing how the approach avoids a potential false-positive conclusion reported by the standard asymptotic method. Ultimately, this research provides biostatisticians with a tool for analyzing clustered data that is practical, efficient, and statistically rigorous.
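To make the permutation benchmark concrete, the following is a minimal sketch of an exact cluster-level permutation test. It is not the paper's saddlepoint method. It assumes that treatment is assigned to whole clusters, that there is no censoring, and that the statistic is a centered rank sum with midranks for ties. The function names `midranks` and `cluster_permutation_pvalue` are hypothetical and the data are made up for illustration. Enumerating every assignment is feasible only for a handful of clusters, which is the computational burden the abstract refers to.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def cluster_permutation_pvalue(clusters, treated):
    """Exact two-sided p-value, permuting treatment labels by whole cluster.

    clusters: dict mapping a cluster id to its list of observations
    treated:  collection of cluster ids assigned to treatment
    """
    ids = sorted(clusters)
    pooled = [y for c in ids for y in clusters[c]]
    ranks = midranks(pooled)
    mean_rank = (len(pooled) + 1) / 2

    # Centered rank sum contributed by each cluster.
    score = {}
    pos = 0
    for c in ids:
        n = len(clusters[c])
        score[c] = sum(ranks[pos:pos + n]) - n * mean_rank
        pos += n

    def stat(group):
        return sum(score[c] for c in group)

    observed = abs(stat(treated))
    # Every way of choosing the same number of treated clusters.
    assignments = list(combinations(ids, len(treated)))
    extreme = sum(1 for g in assignments if abs(stat(g)) >= observed - 1e-12)
    return extreme / len(assignments)


# Illustrative data: four clusters of two observations each.
example_clusters = {"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]}
example_p = cluster_permutation_pvalue(example_clusters, {"C", "D"})
print(example_p)
```

Because labels move only with entire clusters, within-cluster correlation is left intact under every permutation. The number of assignments grows combinatorially with the number of clusters, which is why a fast approximation to this permutation distribution is attractive.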
