Abstract
Structural variants (SVs) contribute substantially to genomic variation and disease, but detecting somatic SVs (sSVs) remains difficult due to reference bias, mosaicism, and enrichment in repetitive regions. Linear reference genomes such as GRCh38 and CHM13 do not fully capture individual genomic structure, which can obscure true somatic variation. Donor-specific assemblies (DSAs), generated from the same genome in which sSVs are being assayed, provide a personalized alternative, yet their performance for sSV detection has not been systematically assessed. As part of the Somatic Mosaicism across Human Tissues (SMaHT) Network, we benchmark a DSA for sSV discovery in the COLO829 melanoma cell line with a matched normal sample from the same individual. We compare sSV detection across GRCh38, CHM13, and the COLO829BL_DSA using three sSV callers (Delly, Severus, and Sniffles2) and sequence data from multiple long-read platforms. Using the COLO829BL_DSA as the reference identifies 1.8-fold more manually validated sSVs than using the linear references, both in regions shared with GRCh38 and CHM13 and in regions unique to the COLO829BL_DSA. Variants detected only with the COLO829BL_DSA are often found in satellite and other repeat-rich regions that are difficult to resolve using standard references. In addition, several COLO829BL_DSA-specific sSVs are located in genes, some of which are associated with cancer. Overall, these results underscore the utility of DSAs in improving sSV detection.