Advances in next-generation sequencing technologies have vastly expanded the availability of diverse genomic, epigenomic and transcriptomic data, presenting an opportunity to develop a general AI model that integrates comprehensive genomic knowledge. Unlike previous predictive models, which are typically specialized for particular tasks, our general AI model unifies a wide range of genomic modalities, such as nascent RNA and ultra-high-resolution chromatin organization, within a multi-task architecture. The model takes ATAC-seq and DNA sequences as inputs and predicts diverse genomic modalities as outputs, and it generalizes well across different cell types and tissues in all tasks on which it was trained. It accurately predicts gene-level transcription measured by various nascent RNA assays and effectively captures enhancer-associated transcription. It also accurately captures the potential functions of non-coding genetic variants and regulatory elements. Finally, we extended the model trained on human data to a general mouse model, which accurately predicts genomic modalities such as high-resolution chromatin contact maps despite limited data availability; these predictions were further validated using an established mouse inner-ear study. This comprehensive approach offers a powerful tool for understanding genome regulation in both human and mouse.
Developing a general AI model for integrating diverse genomic modalities and comprehensive genomic knowledge.
Authors: Zhang Zhenhao, Bao Xinyu, Jiang Linghua, Luo Xin, Wang Yichun, Comai Annelise, Waldhaus Joerg, Hansen Anders S, Li Wenbo, Liu Jie
| Journal: | bioRxiv | Impact factor: | 0.000 |
| Year: | 2025 | Issue/pages: | 2025 May 14 |
| DOI: | 10.1101/2025.05.08.652986 | Research area: | Artificial intelligence |
