Multivariate Poisson lognormal distribution for modeling counts from modern biological data: An overview.

Authors: Subedi Sanjeena, Dang Utkarsh J
Modern biological data are often multivariate discrete counts, and there has been a dearth of statistical distributions to directly model such counts in an efficient manner. While mixed Poisson distributions, e.g., the negative binomial distribution, are often the distributions of choice for univariate data, multivariate statistical distributions and their algorithmic implementations tend to have various drawbacks, e.g., intractable distributions, non-closed-form solutions for parameter estimates, constrained correlation structures, and slow convergence during iterative parameter estimation. Herein, we provide an overview of the Poisson lognormal and multivariate Poisson lognormal distributions. These distributions can be written in a hierarchical fashion. An efficient variational approximation-based parameter estimation strategy, as well as a hybrid approach for full Bayesian posterior estimation, is available for such models, allowing for scaling up and modeling high-dimensional data. We provide comparisons of the univariate Poisson, the negative binomial, and the Poisson lognormal distributions in terms of the estimated mean-variance relationships using simulations and example real datasets. We also discuss the properties of the multivariate Poisson lognormal distribution, including its ability to directly model count data with zero counts, over-dispersion, and both positive and negative covariance elements, and the mapping between correlations in the latent space and in the observed space. Finally, we illustrate their use through two model-based clustering examples using a mixture-of-distributions approach in RNA-seq and microbiome data.
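The hierarchical form mentioned in the abstract can be illustrated with a short simulation. The following is a minimal sketch, not the authors' implementation: it assumes the standard formulation in which a latent vector theta is drawn from a multivariate normal with mean mu and covariance Sigma, and each count is then drawn as Poisson with rate exp(theta_j). The function names (`simulate_mpln`, `mpln_moments`) and the parameter values are hypothetical, chosen only for illustration. The closed-form moments used for comparison are the standard ones for this model: E[Y_j] = exp(mu_j + Sigma_jj / 2), and Cov(Y_j, Y_k) = E[Y_j] E[Y_k] (exp(Sigma_jk) - 1), with E[Y_j] added on the diagonal.

```python
import numpy as np


def simulate_mpln(mu, sigma, n, seed=0):
    """Draw n count vectors from a multivariate Poisson lognormal model.

    Hierarchy (assumed standard form):
        theta ~ N(mu, sigma)            latent Gaussian layer
        Y_j | theta_j ~ Poisson(exp(theta_j))   observed counts
    """
    rng = np.random.default_rng(seed)
    theta = rng.multivariate_normal(mu, sigma, size=n)
    return rng.poisson(np.exp(theta))


def mpln_moments(mu, sigma):
    """Closed-form mean vector and covariance matrix of the observed counts."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mean = np.exp(mu + 0.5 * np.diag(sigma))
    # Off-diagonal: m_j * m_k * (exp(sigma_jk) - 1); the diagonal gains the
    # extra Poisson term m_j, which is what makes Var(Y_j) exceed E[Y_j].
    cov = np.outer(mean, mean) * (np.exp(sigma) - 1.0) + np.diag(mean)
    return mean, cov


# Illustrative (hypothetical) parameters with a negative latent covariance.
mu = [1.0, 0.5]
sigma = [[0.3, -0.15], [-0.15, 0.4]]

y = simulate_mpln(mu, sigma, n=200_000, seed=1)
mean_theory, cov_theory = mpln_moments(mu, sigma)

print("sample mean:     ", y.mean(axis=0))
print("theoretical mean:", mean_theory)
print("sample covariance:\n", np.cov(y, rowvar=False))
print("theoretical covariance:\n", cov_theory)
print("proportion of zero counts per dimension:", (y == 0).mean(axis=0))
```

Under these assumptions, the variance of each count exceeds its mean (over-dispersion), zero counts occur without any special handling, and a negative entry in Sigma carries through to a negative covariance between the observed counts, although its magnitude in the observed space differs from that in the latent space.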
