Trait Association for Flowering Time in Lentil from Global Multi-Environment Data Using GWAS and Machine Learning



Abstract

Flowering time is a key developmental trait in plants, influenced by multiple genes and environmental factors. Understanding its genetic basis and its interaction with the environment facilitates the development of improved varieties adapted to different environments. Conventional genome-wide association studies (GWAS) are widely used to associate genetic markers with heritable traits, but they do not inherently capture interactions among single nucleotide polymorphisms (SNPs) or between SNPs and the environment. Machine learning (ML) approaches can model these interactions and improve trait prediction even in the presence of noise and missing data. In this study, multi-environment lentil (Lens culinaris Medik.) data were analysed using GWAS and two widely used ML models, Random Forest and XGBoost, to identify genetic markers associated with flowering time. Model interpretability was enhanced using explainable AI (XAI) techniques, including SHapley Additive exPlanations (SHAP). GWAS identified eight significant loci across chromosomes 1, 2, 5 and 7, with the most significant SNP located at Chr2_530433205, while the ML approaches identified nine markers on chromosomes 1, 2, 3, 5 and 7, with the most significant SNP at Chr7_523220088. The majority of the identified markers were linked to candidate genes for flowering, and the ML models also identified potential epistasis. These findings highlight ML as a powerful complementary tool to GWAS for trait association.
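The abstract's point that single-marker tests do not inherently capture SNP-by-SNP interactions can be illustrated on synthetic data. The sketch below is not the study's pipeline and uses no lentil data: the functions `simulate` and `marginal_r2`, the interacting marker indices (3 and 7), the 0/1 genotype coding, and all effect sizes are hypothetical choices made for illustration. It simulates a purely epistatic trait and compares a marker-by-marker scan with a test on the explicit interaction term.

```python
import numpy as np


def marginal_r2(genotypes, phenotype):
    """Squared Pearson correlation between each marker column and the phenotype.

    This stands in for a simplified single-marker association scan
    (no population-structure or kinship correction, which real GWAS needs).
    """
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    cov = g.T @ y
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    return (cov / denom) ** 2


def simulate(n_lines=2000, n_snps=50, seed=0):
    """Synthetic inbred lines with a purely epistatic 'flowering time' trait."""
    rng = np.random.default_rng(seed)
    # Hypothetical coding: 0 = reference homozygote, 1 = alternate homozygote.
    genotypes = rng.integers(0, 2, size=(n_lines, n_snps)).astype(float)
    # The trait shifts only when markers 3 and 7 carry different alleles (XOR),
    # so neither marker has a meaningful effect when tested on its own.
    interaction = np.logical_xor(genotypes[:, 3] == 1, genotypes[:, 7] == 1)
    phenotype = 60.0 + 5.0 * interaction + rng.normal(0.0, 1.0, n_lines)
    return genotypes, phenotype


if __name__ == "__main__":
    geno, pheno = simulate()
    single = marginal_r2(geno, pheno)
    pair_term = np.logical_xor(geno[:, 3] == 1, geno[:, 7] == 1).astype(float)
    pair = marginal_r2(pair_term[:, None], pheno)[0]
    print("largest single-marker r^2:", single.max())
    print("r^2 of the explicit interaction term:", pair)
```

In this toy setting the single-marker scan finds almost no signal while the interaction term explains most of the variance. Tree-based models such as Random Forest and XGBoost can pick up this kind of structure without the interaction being specified in advance, which is the motivation the abstract gives for pairing them with GWAS.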
