Abstract
Soybean (Glycine max (L.) Merr.) is one of the most important economic crops worldwide and is used extensively in the production of food, animal feed, and industrial raw materials. As demand for soybean continues to rise, improving both its yield and its quality has become a central focus of agricultural research. Genomic selection (GS) and genome-wide association studies (GWAS) have emerged as effective tools for accelerating genetic improvement and have been widely applied in various crops. In this study, we conducted GWAS and evaluated GS models for five soybean phenotypes (Glycitin content, Oil, Pod, Total isoflavone content, and Total tocopherol content) to assess the effectiveness of different GWAS methods and GS models in soybean genetic improvement. We applied five GWAS methods (fastGWA, BOLT-LMM, FarmCPU, GLM, and MLM) and compared the predictive performance of ten GS models (BayesA, BayesB, BayesC, BL, BRR, SVR_poly, SVR_linear, Ridge, PLS_Regression, and Linear_Regression). Our results indicate that GS using markers selected through GWAS achieved a prediction accuracy of 0.94 at a marker density of 5 K. Furthermore, the Bayesian models proved more stable than the machine learning models. Overall, this study offers new insights into genomic selection in soybean and provides a scientific foundation for future soybean breeding strategies.
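The workflow summarized above, in which GWAS-ranked markers feed a GS model, can be illustrated with a minimal sketch. It is not the pipeline used in this study. Everything in it is an assumption made for illustration: the genotypes and phenotype are simulated, a single-marker squared correlation stands in for a GLM-style association scan, a closed-form ridge regression stands in for the GS model, and all function names and parameter values are hypothetical. Prediction accuracy is taken here as the Pearson correlation between predicted and observed phenotypes in a held-out set.

```python
import numpy as np

def marker_scores(X, y):
    """Squared correlation of each marker with the phenotype.

    A simple stand-in for a single-marker (GLM-style) association scan;
    it does not correct for population structure or kinship.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # monomorphic markers receive a score of 0
    r = (Xc.T @ yc) / denom
    return r ** 2

def ridge_fit(X, y, lam):
    """Closed-form ridge regression (dual form) on centered data."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    n = X.shape[0]
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), y - mu_y)
    beta = Xc.T @ alpha
    return mu_x, mu_y, beta

def ridge_predict(model, X):
    mu_x, mu_y, beta = model
    return mu_y + (X - mu_x) @ beta

def gwas_then_gs(X_train, y_train, X_test, y_test, n_markers, lam=1.0):
    """Rank markers on the training set only, then fit and evaluate ridge."""
    scores = marker_scores(X_train, y_train)
    top = np.argsort(scores)[::-1][:n_markers]
    model = ridge_fit(X_train[:, top], y_train, lam)
    pred = ridge_predict(model, X_test[:, top])
    accuracy = np.corrcoef(pred, y_test)[0, 1]
    return accuracy, top

# Simulated data (hypothetical sizes): 300 lines, 2000 markers coded 0/1/2.
rng = np.random.default_rng(0)
n, p, n_causal = 300, 2000, 20
X = rng.integers(0, 3, size=(n, p)).astype(float)
causal = rng.choice(p, n_causal, replace=False)
effects = rng.normal(0.0, 1.0, n_causal)
g = X[:, causal] @ effects
y = g + rng.normal(0.0, 0.5 * np.std(g), n)

train, test = np.arange(200), np.arange(200, 300)
accuracy, top = gwas_then_gs(X[train], y[train], X[test], y[test], n_markers=100)
print(f"prediction accuracy (Pearson r): {accuracy:.2f}")
```

Markers are ranked using the training lines only. Ranking on the full data set before splitting would leak information from the test lines into marker selection and inflate the reported accuracy.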