Abstract
The shrinkage factor (λ) plays a critical role in shrinkage-based methods such as Ridge Regression Best Linear Unbiased Prediction (RRBLUP) for genomic selection. In these methods, λ controls the strength of penalization applied to marker effect estimates, thereby regulating model complexity and improving prediction accuracy (PA). This study evaluated eight approaches for estimating λ in RRBLUP, alongside the widely used BayesC (BC) method, across diverse genetic architectures. Direct approaches included cross-validation based on mean squared error (MSE-RRBLUP) and Pearson correlation coefficient (PCC-RRBLUP), as well as information criterion-based methods (AIC-RRBLUP, BIC-RRBLUP, DIC-RRBLUP). Indirect approaches, which infer λ from marker effect variances, included the total marker number (NM-RRBLUP), the sum of allelic frequencies (AF-RRBLUP), and BC (RRBLUP-BC). The simulated genome comprised six chromosomes of 1 Morgan each, with 100 quantitative trait loci (QTLs) randomly positioned on each chromosome. Four scenarios were considered: Scenario 1 (3000 markers, h² = 0.2); Scenario 2 (3000 markers, h² = 0.6); Scenario 3 (9000 markers, h² = 0.2); and Scenario 4 (9000 markers, h² = 0.6). The PA, measured as the PCC between true (simulated) and genomic estimated breeding values, indicated that indirect approaches generally outperformed direct ones, except for PCC-RRBLUP, which performed comparably to AF-RRBLUP. In contrast, the information criterion-based direct methods exhibited the lowest PA. Estimated λ values decreased with increasing heritability and decreasing predictor number, regardless of approach. Large effect sizes (Cohen's d) confirmed the practical significance of differences between the best and worst methods. The largest differences in PA were observed between BC and DIC-RRBLUP (d = 1.51) and between BC and AIC/BIC-RRBLUP (d = 1.37) in Scenario 4, both statistically significant (P < 0.05).
Among penalized parameter estimation methods, the largest differences occurred between PCC-RRBLUP and DIC-RRBLUP (d = 1.36) and between AF-RRBLUP and DIC-RRBLUP (d = 1.34) in Scenario 4, indicating very strong practical differences. In conclusion, the AF-RRBLUP approach combines high PA with low computational burden, making it a recommended option among the evaluated methods for genomic selection.
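
As an illustration of how an indirect approach can derive λ from marker effect variances, the sketch below uses the commonly cited relationship λ = σ²e / σ²m, with the marker variance taken as σ²g / NM (marker number) or σ²g / Σ 2p(1 − p) (allele frequencies). These formulas are an assumption about the general form of the NM and AF scalings, not the exact definitions used in this study, and the function names `lambda_nm` and `lambda_af` are hypothetical. Heritability is supplied directly here for simplicity.

```python
# Minimal sketch (assumed formulas, hypothetical function names):
# lambda = sigma_e^2 / sigma_m^2, where sigma_m^2 is the genetic variance
# divided by a scaling term. With h2 = sigma_g^2 / (sigma_g^2 + sigma_e^2),
# the ratio sigma_e^2 / sigma_g^2 equals (1 - h2) / h2.


def lambda_nm(n_markers, h2):
    """Shrinkage factor when marker variance = sigma_g^2 / n_markers."""
    return n_markers * (1.0 - h2) / h2


def lambda_af(allele_freqs, h2):
    """Shrinkage factor when marker variance = sigma_g^2 / sum(2p(1-p))."""
    scale = sum(2.0 * p * (1.0 - p) for p in allele_freqs)
    return scale * (1.0 - h2) / h2


if __name__ == "__main__":
    # The four marker-number / heritability combinations from the abstract.
    scenarios = {1: (3000, 0.2), 2: (3000, 0.6), 3: (9000, 0.2), 4: (9000, 0.6)}
    for sid, (n_markers, h2) in scenarios.items():
        # Illustrative allele frequencies only: every marker set to p = 0.5.
        freqs = [0.5] * n_markers
        print(sid, round(lambda_nm(n_markers, h2)), round(lambda_af(freqs, h2)))
```

Under these assumed formulas, λ falls as heritability rises and as the number of markers drops, which matches the trend reported above for the estimated λ values.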