Abstract
Maintaining cellular ploidy is critical for normal physiological processes. Gains in ploidy are nonetheless frequently observed during development, tissue regeneration, and metabolism, and may contribute to aneuploidy, thereby promoting tumor evolution. Although numerous computational tools have been developed to estimate cellular ploidy from whole-genome sequencing (WGS) data at bulk or single-cell resolution, to our knowledge, no systematic comparison of their performance has been conducted. Here, a benchmarking study of 11 methods for bulk WGS data and 8 methods for single-cell WGS data is presented, utilizing both experimental and simulated datasets derived from diploid cells mixed with aneuploid or polyploid cells. For bulk WGS tools, performance in estimating tumor purity and ploidy is evaluated, along with the influence of preprocessing steps, somatic mutation callers, tumor purity, sequencing platforms, and sequencing depths. PURPLE outperforms the other methods when tumor purity exceeds 30%, regardless of sequencing coverage or platform. However, all existing tools perform poorly when applied to euploid samples or long-read sequencing data. For single-cell WGS tools, ploidy detection accuracy is assessed, and SeCNV is identified as the top-performing method. These findings provide valuable guidance for future research on ploidy analysis and for ongoing improvements in computational tools for single-cell sequencing data.