Deciphering Sequence Determinants of Zygotic Genome Activation Genes: Insights From Machine Learning and the ZGAExplorer Platform

Abstract

The mammalian life cycle begins with the transfer of genetic control from the maternal to the embryonic genome during zygotic genome activation (ZGA), a transition that is pivotal for development. Nevertheless, the conservation of the genes and transcription factors (TFs) that underlie mammalian ZGA remains poorly understood. Here, we compiled a comprehensive set of ZGA genes from mice, humans, pigs, bovines and goats, including Nr5a2 and TPRX1/2. Comparative analyses identified 111 homologous genes and subsequently revealed a conserved genetic coding region, suggesting potential sequence preferences for ZGA genes. Notably, an interpretable machine learning model based on k-mer core features showed excellent performance in predicting ZGA genes (area under the ROC curve [AUC] > 0.81), revealing abundant and intricate 6-base sequence-specific patterns and potential binding TFs, including motifs from NR5A2 and TPRX1/2. Further analysis demonstrated that gene sequence features and epigenetic modification features play equally important roles in regulating ZGA genes. Finally, we developed the ZGAExplorer platform as a resource for screening ZGA genes. Our study unravels the sequence determinants of ZGA genes across species through multi-omics data integration and machine learning, yielding insights into ZGA regulatory mechanisms and embryonic developmental arrest.
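To make the idea of 6-base k-mer features and an AUC score concrete, here is a minimal sketch in Python. It is not the authors' pipeline or the ZGAExplorer implementation: the function names (`kmer_frequencies`, `feature_vector`, `roc_auc`), the frequency normalization, and the toy sequence are all assumptions chosen for illustration. The abstract does not specify which sequence region is featurized or which classifier is used, so the sketch stops at feature extraction and evaluation.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=6):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper: windows containing characters other than
    A/C/G/T (for example N) are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def feature_vector(seq, k=6):
    """Fixed-length vector over all 4**k k-mers in lexicographic order."""
    freqs = kmer_frequencies(seq, k)
    return [freqs.get("".join(p), 0.0) for p in product("ACGT", repeat=k)]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy usage: an 8-base sequence has three overlapping 6-mers.
toy = kmer_frequencies("ACGTACGT")
```

Each gene sequence would map to one 4096-dimensional vector (4^6 possible 6-mers), which any standard classifier could consume. An interpretable model can then rank individual 6-mers by importance, which is the kind of output that could be compared against known TF motifs.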
