Ultra-fast variant effect prediction using biophysical transcription factor binding models

Abstract

Sequence variation within transcription factor (TF)-binding sites can significantly affect TF-DNA interactions, influencing gene expression and contributing to disease susceptibility or phenotypic traits. Despite recent progress in deep sequence-to-function models that predict functional output from sequence data, these methods perform inadequately on some variant effect prediction tasks, especially with common genetic variants. This limitation underscores the importance of leveraging biophysical models of TF binding to enhance the interpretability of variant effect scores and facilitate mechanistic insights. We introduce motifDiff, a novel computational tool designed to quantify variant effects using mono- and dinucleotide position weight matrices. motifDiff offers several key advantages, including scalability to score millions of variants within minutes, implementation of a statistically rigorous normalization strategy that is critical for optimal performance, and support for both dinucleotide and mononucleotide models. We demonstrate motifDiff's efficacy by evaluating it across diverse ground truth datasets that quantify the effects of common variants in vivo, thereby establishing robust benchmarks for the predictive value of variant effect calculations. Finally, we show that our tool provides unique insights when interpreting human accelerated regions. motifDiff is available as a standalone Python application at https://github.com/rezwanhosseini/MotifDiff.
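
To make the idea of scoring a variant against a position weight matrix concrete, the following is a minimal sketch, not motifDiff's implementation or API. It assumes a toy mononucleotide log-odds matrix (`TOY_PWM`, invented for illustration), scans the forward strand only, and defines the effect as the change in the best-scoring window that overlaps the variant. The normalization strategy and dinucleotide models mentioned in the abstract are deliberately omitted, and all function names are hypothetical.

```python
# Illustrative sketch only: NOT motifDiff's code or API.
# Assumptions: toy log-odds PWM, forward strand only, no normalization,
# and the sequence is at least as long as the motif.
BASES = "ACGT"

# Hypothetical 3-bp motif preferring "TGA"; columns follow BASES order.
TOY_PWM = [
    [-1.0, -1.0, -1.0, 2.0],  # position 0 prefers T
    [-1.0, -1.0, 2.0, -1.0],  # position 1 prefers G
    [2.0, -1.0, -1.0, -1.0],  # position 2 prefers A
]


def score_window(pwm, window):
    """Sum the per-position weights for one window of length len(pwm)."""
    return sum(row[BASES.index(base)] for row, base in zip(pwm, window))


def best_score(pwm, seq, var_pos):
    """Best PWM score over all windows that overlap var_pos (0-based)."""
    k = len(pwm)
    first = max(0, var_pos - k + 1)
    last = min(var_pos, len(seq) - k)
    return max(score_window(pwm, seq[s:s + k]) for s in range(first, last + 1))


def variant_effect(pwm, ref_seq, var_pos, alt_base):
    """Alt-minus-ref change in the best overlapping window score."""
    alt_seq = ref_seq[:var_pos] + alt_base + ref_seq[var_pos + 1:]
    return best_score(pwm, alt_seq, var_pos) - best_score(pwm, ref_seq, var_pos)


# G>C at position 3 disrupts the "TGA" match in the reference sequence.
effect = variant_effect(TOY_PWM, "CCTGACC", 3, "C")
print(effect)  # -3.0
```

A negative value here means the alternate allele weakens the best predicted binding site. A practical tool would also scan the reverse strand and convert raw score differences onto a comparable scale across motifs, which is where the normalization step described above would come in.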
