Fast3VmrMLM: A fast algorithm that integrates genome-wide scanning with machine learning to accelerate gene mining and breeding by design for polygenic traits in large-scale GWAS datasets

Abstract

Genetic dissection and breeding by design for polygenic traits remain substantial challenges. To address these challenges, it is important to identify as many genes as possible, including key regulatory genes. Here, we developed a genome-wide scanning plus machine learning framework, integrated with advanced computational techniques, to propose a novel algorithm named Fast3VmrMLM. This algorithm aims to enhance the identification of abundant and key genes for polygenic traits in the era of big data and artificial intelligence. The algorithm was extended to identify haplotype (Fast3VmrMLM-Hap) and molecular (Fast3VmrMLM-mQTL) variants. In simulation studies, Fast3VmrMLM outperformed existing methods in detecting dominant, small, and rare variants, requiring only 3.30 and 5.43 h (20 threads) to analyze the 18K rice and UK Biobank-scale datasets, respectively. Fast3VmrMLM identified more known (211) and candidate (384) genes for 14 traits in the 18K rice dataset than FarmCPU (100 known genes). Additionally, it identified 26 known and 24 candidate genes for seven yield-related traits in a maize NC II design; Fast3VmrMLM-mQTL identified two known soybean genes near structural variants. We demonstrated that this novel two-step framework outperformed genome-wide scanning alone. In breeding by design, a genetic network constructed via machine learning using all known and candidate genes identified in this study revealed 21 key genes associated with rice yield-related traits. All associated markers yielded high prediction accuracies in rice (0.7443) and maize (0.8492), enabling the development of superior hybrid combinations. A new breeding-by-design strategy based on the identified key genes was also proposed. This study provides an effective method for gene mining and breeding by design.
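
To make the two-step idea of scanning first and refining second concrete, here is a minimal sketch on simulated data. It is not the Fast3VmrMLM implementation: the abstract does not specify the statistical model of either step, so a per-marker Pearson correlation stands in for the genome-wide scan and a greedy forward stagewise selection stands in for the machine learning step. All function names, thresholds, marker indices, and effect sizes below are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def scan(genotypes, phenotype, keep):
    """Step 1 (stand-in scan): score each marker alone, keep the top `keep`."""
    scores = [abs(pearson(col, phenotype)) for col in genotypes]
    order = sorted(range(len(genotypes)), key=lambda j: scores[j], reverse=True)
    return order[:keep]


def forward_select(genotypes, phenotype, candidates, max_markers, min_abs_r):
    """Step 2 (stand-in refinement): greedy forward stagewise selection.

    Repeatedly picks the candidate most correlated with the current residual,
    regresses the residual on it, and stops when no candidate reaches
    `min_abs_r` (an arbitrary illustrative threshold).
    """
    y_mean = mean(phenotype)
    residual = [y - y_mean for y in phenotype]
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < max_markers:
        best = max(remaining, key=lambda j: abs(pearson(genotypes[j], residual)))
        if abs(pearson(genotypes[best], residual)) < min_abs_r:
            break
        col_mean = mean(genotypes[best])
        centered = [g - col_mean for g in genotypes[best]]
        beta = (sum(c * r for c, r in zip(centered, residual))
                / sum(c * c for c in centered))
        residual = [r - beta * c for r, c in zip(residual, centered)]
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Simulated toy data: 200 individuals, 50 markers coded 0/1/2.
# Markers 3 and 17 are the (hypothetical) causal loci.
rng = random.Random(0)
n_ind, n_markers = 200, 50
genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_ind)]
             for _ in range(n_markers)]
phenotype = [1.0 * genotypes[3][i] - 0.8 * genotypes[17][i] + rng.gauss(0, 0.5)
             for i in range(n_ind)]

candidates = scan(genotypes, phenotype, keep=10)
chosen = forward_select(genotypes, phenotype, candidates,
                        max_markers=5, min_abs_r=0.3)
print("candidates after scan:", candidates)
print("markers kept after refinement:", chosen)
```

In this toy setting the scan deliberately keeps more markers than are causal, and the second step prunes that shortlist by fitting the candidates against what earlier picks have left unexplained. That is the general motivation for pairing a permissive scan with a joint selection step; the real algorithm's models and thresholds should be taken from the paper itself.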
