From noise to models to numbers: Evaluating negative binomial models and parameter estimations in single-cell RNA-seq

从噪声到模型再到数值:评估单细胞RNA测序中的负二项式模型和参数估计

阅读:3

Abstract

The Negative Binomial (NB) distribution is widely used to approximate transcript count distributions in single-cell RNA sequencing (scRNA-seq) data, yet the reason for its ubiquity is not fully understood. Here, we employ a computationally efficient model selection technique to map the relationship between the best-fit models - Beta-Poisson (Telegraph), NB, and Poisson - and the kinetic parameters that govern gene expression stochasticity. Our findings reveal that the NB distribution closely approximates simulated data (incorporating both biological and technical noise) within an intermediate range of the sum of the gene activation and inactivation rates normalized by the mRNA degradation rate. This range expands with decreasing mean expression, increasing technical noise, and larger sample sizes. The results imply that: (i) good NB fits occur in diverse parameter regimes without exclusively indicating transcriptional bursting; (ii) for small sample sizes, biological noise predominantly shapes the NB profile even when technical noise is present; (iii) under steady-state conditions, gene-specific parameters (burst size and frequency) estimated in regions where the NB model fits well, typically show large relative errors, even after corrections for technical noise, and (iv) gene ranking by burst frequency remains reliably accurate, suggesting that burst parameters are most informative in a relative sense. Finally, applying technical-noise-corrected model fitting to scRNA-seq data confirms that a substantial fraction of mammalian genes fall within these NB-fitting regimes, despite lacking transcriptional bursting.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。