Abstract
Impatiens balsamina is a plant with notable medicinal, ornamental, and edible value. However, knowledge of its genome evolution and molecular pharmacognosy remains limited. Here, a multi-omics approach integrating genome sequencing with transcriptome and metabolome profiling of roots, leaves, and flowers was performed. A 691.61 Mb chromosome-level draft genome of I. balsamina is presented, with annotation revealing 32,949 protein-coding genes. It is proposed that two rounds of whole-genome duplication may be major drivers of species diversity in Balsaminaceae lineages and Impatiens. A considerable number of beneficial secondary metabolites, including dihydrokaempferol, kaempferol, and five kaempferol derivatives (KKDs), accumulated at markedly different levels in the roots, leaves, and flowers of I. balsamina. The structural genes IbCHI, IbCHS, IbF3H, and IbFLS, together with the glycosyltransferase gene IbUGT73C, were identified as being involved in KKD biosynthesis. In addition, transcription factor genes from the WRKY, bHLH, MYB, and Myb-like families, as well as a P450 gene, were suggested to directly regulate KKD biosynthesis on the basis of correlation analysis, weighted gene co-expression network analysis (WGCNA), and protein-protein interaction analysis. Overall, these findings provide insights into the genome evolution and molecular pharmacognosy of I. balsamina and offer a foundation for breeding and drug development, not only for I. balsamina but also for other Impatiens species.