Uncovering dormancy stage predictors in sweet cherry through DNA methylation and machine learning integration

Abstract

BACKGROUND: Dormancy in sweet cherry (Prunus avium L.) is a complex physiological process that allows floral buds to survive adverse winter conditions and resume growth under favorable spring conditions. Traditional phenological evaluations and agroclimatic models, although widely used, show limited resolution and robustness across years and cultivars. Epigenetic mechanisms, particularly DNA methylation, have emerged as critical regulators of dormancy transitions. However, the integration of methylation data with machine learning (ML) tools for predictive modeling remains largely unexplored in perennial species. This study presents an integrative framework that combines whole-genome bisulfite sequencing and supervised ML to identify methylation markers, at the cytosine and region levels, associated with specific dormancy stages in sweet cherry.

METHODS: DNA methylation datasets from three different experiments were classified using Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), complemented by SHapley Additive exPlanations (SHAP) for interpretability. Feature importance was evaluated using an integrated model consensus across the RF, XGBoost, and SHAP metrics.

RESULTS: Feature selection significantly improved classification performance in both the three-stage (paradormancy, endodormancy, ecodormancy) and two-stage (endodormancy, ecodormancy) models. RF consistently outperformed XGBoost, achieving an accuracy of up to 97.1% in the two-stage scenario using informative cytosine-level data. The SHAP analyses showed that the selected features effectively discriminated among dormancy stages and revealed biologically significant epigenetic features. The key features were non-randomly distributed throughout the genome, often colocalizing with long terminal repeat (LTR) transposable elements, particularly the LTR/Ty3-retrotransposon and LTR/Copia families. Some features also colocalized with previously identified QTLs for chilling and heat requirements, flowering time, and maturity date.

CONCLUSIONS: This study highlights the usefulness of combining high-resolution methylation data with interpretable ML techniques to identify robust dormancy biomarkers. The enrichment of dormancy-associated features within transposable elements (TEs) and gene-proximal regions suggests epigenetic regulation through TE-mediated chromatin remodeling. These findings contribute to a deeper understanding of dormancy mechanisms and offer a basis for developing non-destructive, methylation-based tools to improve phenological management in perennial fruit crops.
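
The abstract does not say how the consensus across the RF, XGBoost, and SHAP importance metrics was computed. One plausible reading is a rank aggregation: rank the features under each metric, then keep those with the best average rank. The sketch below illustrates that idea only. The function name `consensus_rank`, the metric labels, the site identifiers, and all scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def consensus_rank(importances, top_k):
    """Return the top_k features by mean rank across importance metrics.

    importances: dict mapping metric name -> dict of feature -> score,
                 where a higher score means a more important feature.
    This is an assumed aggregation rule, not the study's published method.
    """
    features = sorted(next(iter(importances.values())))
    for scores in importances.values():
        if sorted(scores) != features:
            raise ValueError("all metrics must score the same features")

    ranks = {f: [] for f in features}
    for scores in importances.values():
        # Rank 1 is the most important feature under this metric;
        # ties are broken alphabetically by feature name.
        ordered = sorted(features, key=lambda f: (-scores[f], f))
        for position, feature in enumerate(ordered, start=1):
            ranks[feature].append(position)

    mean_rank = {f: mean(r) for f, r in ranks.items()}
    return sorted(features, key=lambda f: (mean_rank[f], f))[:top_k]


# Made-up importance scores for four hypothetical cytosine sites.
scores = {
    "rf_importance": {"chr1:1050": 0.40, "chr1:2210": 0.30, "chr3:880": 0.20, "chr5:47": 0.10},
    "xgb_gain":      {"chr1:1050": 0.35, "chr1:2210": 0.15, "chr3:880": 0.45, "chr5:47": 0.05},
    "mean_abs_shap": {"chr1:1050": 0.50, "chr1:2210": 0.10, "chr3:880": 0.30, "chr5:47": 0.05},
}

selected = consensus_rank(scores, top_k=2)
print(selected)  # ['chr1:1050', 'chr3:880']
```

Averaging ranks rather than raw scores keeps the three metrics comparable even though they are on different scales. In practice the scores would come from the fitted models rather than from hand-written dictionaries.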
