Abstract
The 100-seed weight (100-SW) is a critical determinant of soybean yield, so identifying and functionally characterizing its underlying genes is essential for the genetic improvement of seed and yield-related traits. A residual heterozygous line (RHL) segregating for 100-SW was derived from a recombinant inbred line (RIL) population generated by crossing small-seed (19.75 ± 1.93 g) and large-seed (26.20 ± 0.82 g) soybean parents. Phenotypic segregation of 100-SW was analyzed, and a chi-square test was used to verify the segregation ratio. Bulked segregant analysis combined with whole-genome sequencing (BSA-seq) was performed using both the Euclidean distance and index algorithms to map the target gene. Functional annotation, molecular marker validation, and germplasm resequencing were conducted to identify the key candidate gene and its haplotypes. In the RHL, 100-SW segregated significantly and was normally distributed, and the chi-square test supported a 1:2:1 segregation ratio, indicating control by a single nuclear gene. BSA-seq mapped the gene to a 5.46 Mb region on chromosome 19 containing 74 non-synonymous SNPs in coding sequences (including one causing loss of the initiation codon), distributed across 36 genes. GmDt1 (Glycine max Determinant stem 1) was confirmed as the key candidate gene, with a G-to-T non-synonymous mutation in its first exon as the functional locus (validated in the original RIL population). Resequencing of diverse germplasm classified GmDt1 into five haplotypes; the large-seed haplotype GmDt1-H2 was absent from wild soybeans but present in 9.07% of landraces and 15.83% of cultivated soybeans. This gradual increase in the frequency of GmDt1-H2 from wild to cultivated soybeans suggests that the haplotype has been positively selected during soybean breeding. The identification of GmDt1 and its functional mutation provides a valuable molecular target for the genetic improvement of soybean seed traits and yield.
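As an illustration of the segregation test mentioned above, the following is a minimal sketch of a chi-square goodness-of-fit test against a 1:2:1 ratio. It is not the authors' analysis code. The function name `chi_square_1_2_1` and the class counts are hypothetical, and the three classes are assumed to be small-seed, intermediate, and large-seed plants. With three classes the test has 2 degrees of freedom, for which the p-value has the closed form exp(-χ²/2).

```python
import math


def chi_square_1_2_1(observed):
    """Goodness-of-fit test of three class counts against a 1:2:1 ratio.

    Returns the chi-square statistic and its p-value (2 degrees of freedom).
    """
    if len(observed) != 3:
        raise ValueError("expected exactly three class counts")
    total = sum(observed)
    # Expected counts under 1:2:1 segregation
    expected = [total * ratio / 4 for ratio in (1, 2, 1)]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with df = 2
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical counts for illustration only, not data from the study
counts = [24, 52, 24]
stat, p_value = chi_square_1_2_1(counts)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the observed counts do not deviate significantly from 1:2:1, which is consistent with segregation at a single locus.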