Applying traditional and machine learning-based GWAS approaches for marker-trait identification in wheat

Abstract

INTRODUCTION: Complex traits arise from polygenic and interactive genomic architectures that are difficult to resolve using traditional genome-wide association study (GWAS) approaches. Machine learning (ML) provides complementary methods capable of capturing non-linear effects, improving signal detection, and enhancing the predictive accuracy of marker-trait associations (MTAs).

METHODS: Using a publicly available winter wheat dataset (CIMMYT), we evaluated several widely used traditional GWAS tools (GAPIT, GCTA, GEMMA, sommer, and TASSEL) with respect to computational efficiency, model performance, and the consistency of detected associations. In parallel, ML approaches (Elastic Net, Extreme Gradient Boosting (XGBoost), Random Forest, and the hybrid TSLRF model) were assessed based on feature importance metrics and functional annotation of selected markers.

RESULTS: Despite a shared reliance on mixed linear models, the traditional GWAS tools differed in runtime and showed modest but meaningful variability in the number and overlap of MTAs. ML models recovered several associations detected by traditional methods and additionally identified novel markers, potentially reflecting non-linear or epistatic effects.

DISCUSSION: Our findings demonstrate that ML can effectively complement traditional GWAS approaches for marker-trait identification in wheat. By extending beyond additive effects, ML broadens the scope of detectable genetic signals, providing a practical way to analyze complex traits and support informed marker-assisted breeding strategies.
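
The comparison of "number and overlap of MTAs" across tools can be illustrated with a small sketch. The abstract does not say how overlap was quantified, so the Jaccard index and the "reported by at least k tools" consensus rule used here are assumptions, not the authors' procedure. The function names (`jaccard`, `pairwise_overlap`, `consensus_markers`) and the marker IDs are hypothetical placeholders and do not represent results from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two marker sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(mta_sets):
    """Return {(tool_a, tool_b): (shared markers, Jaccard index)} for every tool pair."""
    result = {}
    for tool_a, tool_b in combinations(sorted(mta_sets), 2):
        shared = mta_sets[tool_a] & mta_sets[tool_b]
        result[(tool_a, tool_b)] = (shared, jaccard(mta_sets[tool_a], mta_sets[tool_b]))
    return result


def consensus_markers(mta_sets, min_tools=2):
    """Return the markers reported by at least `min_tools` tools."""
    counts = {}
    for markers in mta_sets.values():
        for marker in markers:
            counts[marker] = counts.get(marker, 0) + 1
    return {marker for marker, n in counts.items() if n >= min_tools}


# Hypothetical placeholder marker IDs, NOT results from the study.
example_sets = {
    "GAPIT": {"m1", "m2", "m3"},
    "GEMMA": {"m2", "m3", "m4"},
    "RandomForest": {"m3", "m5"},
}

if __name__ == "__main__":
    for pair, (shared, score) in pairwise_overlap(example_sets).items():
        print(pair, sorted(shared), round(score, 2))
    print("consensus:", sorted(consensus_markers(example_sets)))
```

In practice, each set would hold the marker IDs that one tool reports as significant (or, for an ML model, the top-ranked markers by feature importance). Markers found by an ML model but absent from every GWAS set would be the "novel" candidates that merit functional annotation.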
