Supervised learning of enhancer-promoter specificity based on genome-wide perturbation studies highlights areas for improvement in learning

基于全基因组扰动研究的增强子-启动子特异性监督学习突显了学习中需要改进的领域

阅读:1

Abstract

MOTIVATION: Understanding the rules that govern enhancer-driven transcription remains a central unsolved problem in genomics. Now with multiple massively parallel enhancer perturbation assays published, there are enough data that we can utilize to learn to predict enhancer-promoter (EP) relationships in a data-driven manner. RESULTS: We applied machine learning to one of the largest enhancer perturbation studies integrated with transcription factor (TF) and histone modification ChIP-seq. The results uncovered a discrepancy in the prediction of genome-wide data compared to data from targeted experiments. Relative strength of contact was important for prediction, confirming the basic principle of EP regulation. Novel features such as the density of the enhancers/promoters in the genomic region was found to be important, highlighting our lack of understanding on how other elements in the region contribute to the regulation. Several TF peaks were identified that improved the prediction by identifying the negatives and reducing False Positives. In summary, integrating genomic assays with enhancer perturbation studies increased the accuracy of the model, and provided novel insights into the understanding of enhancer-driven transcription. AVAILABILITY AND IMPLEMENTATION: The trained models, data, and the source code are available at http://doi.org/10.5281/zenodo.11290386 and https://github.com/HanLabUNLV/sleps.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。