Abstract
Accurate detection of low-frequency mutations is crucial for understanding viral evolution and tumorigenesis in humans, but is often confounded by technical artifacts introduced during library preparation and sequencing. We present GENOMICON-Seq, an end-to-end simulation tool that models both amplicon and whole exome sequencing (WES) workflows with realistic biological mutations and technical noise. GENOMICON-Seq inserts ground truth mutations, ranging from APOBEC3-like edits to COSMIC single base substitution signatures, before subjecting samples to simulated PCR errors, probe-capture enrichment, and Illumina-specific sequencing biases. Because the tool tracks each mutation's origin (true or error-derived), researchers can pinpoint detection limits and optimize variant-calling thresholds. We illustrate GENOMICON-Seq's versatility through case studies involving human papillomavirus (HPV) amplicon sequencing, highlighting the impacts of polymerase fidelity, viral copy number, and read depth on detecting low-frequency mutations. In parallel, WES simulations demonstrate how capture biases and varying allele frequencies affect somatic mutation calls. GENOMICON-Seq is thus a flexible, reproducible framework for assessing new protocols, benchmarking variant callers, and refining data analysis pipelines, ultimately reducing costly trial-and-error in the laboratory. The Docker-based package is freely available at https://github.com/Rounge-lab/GENOMICON-Seq .