An efficient method for statistical significance calculation of transcription factor binding sites

一种计算转录因子结合位点统计显著性的有效方法

阅读:2

Abstract

Various statistical models have been developed to describe the DNA binding preference of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to scores assigned. Statistical significance of these scores, usually known as the p-value, play a critical role in identification. We developed an efficient algorithm to provide precise calculation of the statistical significance, remarkably enhancing the calculation efficiency by reducing the time complexity from an exponent scale to a linear scale, and successfully extended the application of this algorithm to a wide range of models, from the commonly used position weight matrix models to the complicated Bayesian Network models. Further, we calculated p-values of all transcription factor DNA binding sites recorded in the database, JASPAR, and based on these, we investigated some unseen properties of p-values as a whole, such as the p-value distribution of different models and the p-value variance according to changed scoring schemes. We hope that our algorithm and the result of computational experiments would offer an improved solution to the statistical significance of transcription factor binding sites. The software to implement our method can be downloaded from http://pcal.biosino.org/pCal.html.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。