Abstract
Soybean (Glycine max) seed coat color variation is determined by the accumulation of flavonoid-derived pigments, although the molecular mechanisms underlying this trait remain poorly understood. This study integrated RNA sequencing (RNA-seq) and high-performance liquid chromatography (HPLC)-based metabolite measurements to investigate black and yellow seed coat soybean lines derived from the same genetic background. Metabolite analysis revealed significantly higher total phenolic content (TPC), total flavonoid content (TFC), total anthocyanin content (TAC), total proanthocyanidin content (TPAC), and antioxidant activity (DPPH and ABTS assays) in black seed coats, whereas yellow seed coats exhibited elevated total isoflavone content (TIC). RNA-seq at 110 days after sowing (DAS) identified differential expression of flavonoid pathway genes associated with these metabolic differences. Genes upregulated in black seed coats included flavanone 3-hydroxylase (F3H), anthocyanidin synthase (ANS), UDP-glycosyltransferases (UGT78D2, UGT79B6), and glutathione S-transferase (GSTF11), encoding enzymes reported to function in anthocyanin biosynthesis, glycosylation, and vacuolar transport, respectively. Conversely, leucoanthocyanidin reductase (LAR) genes showed higher expression in yellow seed coats despite lower proanthocyanidin (PA) levels, whereas LAC5 exhibited black seed coat-specific expression consistent with potential PA polymerization activity. R2R3-MYB transcription factor genes and small heat shock protein (sHSP) genes were also upregulated in black seed coats, suggesting candidate regulatory roles in pigmentation and stress responses. Cytochrome P450 genes showed preferential expression in yellow seed coats, consistent with isoflavonoid pathway activation. Together, these findings help elucidate the genetic and metabolic regulation of seed coat color in soybean and identify candidate genes relevant for breeding and functional genomics research.