Abstract
The chemical phenotype (chemotype) of Cannabis sativa is defined by the ratio of cannabidiolic acid (CBDA) to Δ9-tetrahydrocannabinolic acid (THCA). Although the Mendelian segregation of these traits suggests a single-locus biallelic system, recent sequencing and phylogenetic evidence indicates that they are encoded by two distinct, tightly linked genes. The precise genomic architecture of this region, known as the B locus, has remained poorly defined. In this study, we analyzed recently released high-quality Cannabis reference genomes to resolve the structure of the B locus. Our results demonstrate that this region functions as a supergene, characterized by suppressed recombination that facilitates Mendelian-like switching between phenotypic states. Comparative genomic analysis reveals substantial structural polymorphism within the locus, including variation in gene copy number and large-scale insertions/deletions (indels). Furthermore, we functionally characterized three previously unstudied members of the cannabinoid oxidocyclase family. We find that these enzymes primarily catalyze the production of cannabichromenic acid (CBCA), reinforcing the model that the cannabinoid profile is dictated specifically by the presence and expression of THCAS or CBDAS. Finally, we mapped the expression profile of the entire berberine bridge enzyme (BBE) family, identifying widespread expression across plant tissues, including glandular trichomes. Collectively, these findings resolve the genomic architecture of the B locus, clarify the enzymatic basis of cannabinoid profile determination, and establish a framework for understanding the evolutionary maintenance of chemotype diversity in C. sativa.