Accelerated Bayesian inference of population size history from recombining sequence data

利用重组序列数据加速贝叶斯推断种群规模历史

阅读:1

Abstract

I present PHLASH, a new Bayesian method for inferring population history from whole genome sequence data. PHLASH is population history learning by averaging sampled histories: it works by drawing random, low-dimensional projections of the coalescent intensity function from the posterior distribution of a PSMC-like model, and averaging them together to form an accurate and adaptive size history estimator. On simulated data, PHLASH tends to be faster and have lower error than several competing methods including SMC++, MSMC2, and FITCOAL. Moreover, it provides a full posterior distribution over population size history, leading to automatic uncertainty quantification of the point estimates, as well to new Bayesian testing procedures for detecting population structure and ancient bottlenecks. On the technical side, the key advance is a novel algorithm for computing the score function (gradient of the log-likelihood) of a coalescent hidden Markov model: when there are M hidden states, the algorithm requires 𝒪(M2) time and 𝒪(1) memory per decoded position, the same cost as evaluating the log-likelihood itself using the naïve forward algorithm. This algorithm is combined with a hand-tuned implementation that fully leverages the power of modern GPU hardware, and the entire method has been released as an easy-to-use Python software package.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。