Pseudobulk with proper offsets has the same statistical properties as generalized linear mixed models in single-cell case-control studies

Abstract

MOTIVATION: Generalized linear mixed models (GLMMs), such as the negative-binomial or Poisson linear mixed model, are widely applied to single-cell RNA sequencing data to compare transcript expression between conditions that are determined at the subject level. However, these models are computationally intensive, and their statistical performance relative to pseudobulk approaches is poorly understood.

RESULTS: We propose offset-pseudobulk as a lightweight alternative to GLMMs. We prove that a count-based pseudobulk equipped with a proper offset variable has the same statistical properties as GLMMs in terms of both point estimates and standard errors. We confirm our findings using simulations based on real data. Offset-pseudobulk is substantially faster (more than 10×) and numerically more stable than GLMMs.

AVAILABILITY AND IMPLEMENTATION: Offset-pseudobulk can be easily implemented in any generalized linear model software by adjusting a few options. The code is available at https://github.com/hanbin973/pseudobulk_is_mm.
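To make the idea of a count-based pseudobulk with an offset concrete, here is a minimal sketch in plain Python. It is not taken from the authors' repository. It assumes that the offset is the log of the summed per-cell size factors of each subject, and it swaps in a plain Poisson log-linear model with a single binary case/control covariate, for which the maximum-likelihood log fold change has a closed form. The function names `pseudobulk` and `poisson_log_fold_change` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def pseudobulk(cells):
    """Aggregate cell-level data to one record per subject.

    cells: iterable of (subject_id, count, size_factor) for a single gene.
    Returns {subject_id: (summed_count, offset)}, where the offset is the
    log of the summed size factors (an assumption of this sketch).
    """
    agg = defaultdict(lambda: [0, 0.0])
    for subject, count, size_factor in cells:
        agg[subject][0] += count
        agg[subject][1] += size_factor
    return {s: (y, math.log(t)) for s, (y, t) in agg.items()}


def poisson_log_fold_change(bulk, is_case):
    """Closed-form Poisson GLM estimate of the case-vs-control log fold change.

    With an intercept, one binary covariate, and an offset, the fitted group
    totals equal the observed group totals, so the estimate is the log ratio
    of (total count / total exposure) between the two groups.
    """
    totals = {True: [0, 0.0], False: [0, 0.0]}
    for subject, (y, offset) in bulk.items():
        group = bool(is_case[subject])
        totals[group][0] += y
        totals[group][1] += math.exp(offset)  # exposure implied by the offset
    rate = {g: y / t for g, (y, t) in totals.items()}
    return math.log(rate[True] / rate[False])


# Toy data (made up): two case subjects, two control subjects, two cells each.
cells = [
    ("s1", 4, 1.0), ("s1", 6, 1.0),
    ("s2", 3, 0.5), ("s2", 7, 1.5),
    ("s3", 2, 1.0), ("s3", 3, 1.0),
    ("s4", 1, 0.5), ("s4", 4, 1.5),
]
is_case = {"s1": True, "s2": True, "s3": False, "s4": False}

bulk = pseudobulk(cells)
lfc = poisson_log_fold_change(bulk, is_case)
print(f"log fold change: {lfc:.4f}")  # log(2), about 0.6931
```

In GLM software the same model would typically be fitted by passing the per-subject offset to the fitting routine. This sketch covers only the point estimate; standard errors that account for overdispersion between subjects would need a negative-binomial or quasi-likelihood fit.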
