Abstract
Choerospondias axillaris (Roxb.) B.L.Burtt & A.W.Hill is an economically important tree in the family Anacardiaceae with industrial, medicinal, and ecological value, yet its genome size has not been reported. Here, we optimized a flow cytometry-based method for ploidy analysis and found WPB lysis solution to be the most effective. Analysis of 58 C. axillaris accessions identified 47 diploids and 11 triploids. The average genome size of the diploids was estimated at 450.36 Mb. Illumina sequencing of a diploid accession (No. 22) generated 81.98 Gb of high-quality data (224.44× depth). K-mer analysis estimated the genome size at 365.25 Mb, with 0.91% heterozygosity, 34.17% GC content, and 47.74% repetitive sequences, indicating a highly heterozygous and highly repetitive genome. Genome assembly may therefore require a combination of second- and third-generation sequencing technologies. Comparison against the NT database showed the highest similarity to existing C. axillaris sequences (3.01%) and to Pistacia vera (2.5%). This study provides an important foundation for C. axillaris genome sequencing and molecular genetics.
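The figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' pipeline: it assumes sequencing depth is total clean bases divided by genome size and that 1 Gb = 1000 Mb. The function and variable names are hypothetical. Under these assumptions, the reported depth agrees with the k-mer genome size estimate rather than the flow cytometry estimate.

```python
# Figures quoted in the abstract.
CLEAN_DATA_GB = 81.98     # high-quality Illumina data
KMER_GENOME_MB = 365.25   # genome size from k-mer analysis
FLOW_GENOME_MB = 450.36   # mean diploid genome size from flow cytometry


def sequencing_depth(data_gb: float, genome_mb: float) -> float:
    """Depth = total bases / genome size (assumes 1 Gb = 1000 Mb)."""
    return (data_gb * 1000.0) / genome_mb


# Depth implied by each of the two genome size estimates.
depth_kmer = sequencing_depth(CLEAN_DATA_GB, KMER_GENOME_MB)
depth_flow = sequencing_depth(CLEAN_DATA_GB, FLOW_GENOME_MB)

# Ratio of the k-mer estimate to the flow cytometry estimate.
size_ratio = KMER_GENOME_MB / FLOW_GENOME_MB

print(f"Depth using k-mer genome size:         {depth_kmer:.2f}x")
print(f"Depth using flow cytometry genome size: {depth_flow:.2f}x")
print(f"k-mer / flow cytometry size ratio:      {size_ratio:.3f}")
```

The k-mer estimate is roughly four fifths of the flow cytometry estimate. The abstract does not explain this gap, so no cause is assumed here.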