Abstract
SUMMARY: We introduce an in silico PCR (sPCR) method that assembles specific genomic regions spanned by PCR primers directly from raw sequence reads. This allows a user to quickly isolate the exact regions that are abundant in public archives of gene sequences, leveraging the decades of work that have gone into optimizing primer sequences for benchtop PCR. We implement sPCR in sharkmer as a targeted de Bruijn graph assembler that is seeded with the forward primer sequence and terminated at the reverse primer sequence. sPCR is useful for a variety of routine tasks, including validating the species identity of a dataset, identifying contaminants, and quickly building phylogenies from raw sequence data.

AVAILABILITY AND IMPLEMENTATION: sharkmer is written in Rust. Code, instructions for installation and use, tests, and other resources are available in the GitHub repository at https://github.com/caseywdunn/sharkmer and at Zenodo with DOI 10.5281/zenodo.19020708. sharkmer can also be installed via bioconda.
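
The idea of a primer-seeded, primer-terminated de Bruijn graph assembly can be illustrated with a toy example. The sketch below is not sharkmer's implementation (which is written in Rust) and makes simplifying assumptions that are ours: reads are error-free and all on the forward strand, primers must match exactly, and the function names `build_graph`, `reverse_complement`, and `spcr`, as well as the parameters `k` and `max_len`, are hypothetical.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Map each (k-1)-mer to the set of bases that follow it in any read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[-1])
    return graph


def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def spcr(reads, forward, reverse, k=5, max_len=1000):
    """Toy in silico PCR: walk the graph from the forward primer until the
    reverse complement of the reverse primer is reached (assumption: exact
    primer matches, forward-strand reads only)."""
    if len(forward) < k - 1:
        raise ValueError("forward primer must be at least k-1 bases long")
    graph = build_graph(reads, k)
    end = reverse_complement(reverse)
    products = []
    stack = [forward]  # each entry is a partial path seeded with the primer
    while stack:
        path = stack.pop()
        if len(path) > len(forward) and path.endswith(end):
            products.append(path)  # reverse primer site reached: terminate
            continue
        if len(path) >= max_len:
            continue  # give up on paths that never reach the reverse primer
        # graph.get avoids inserting empty entries into the defaultdict
        for base in sorted(graph.get(path[-(k - 1):], ())):
            stack.append(path + base)
    return products


# Hypothetical template and three overlapping "reads" drawn from it.
template = "TTACGTGACCTAGGCATTCGGACAA"
reads = [template[0:12], template[6:18], template[12:25]]
products = spcr(reads, forward="ACGTGA", reverse="GTCCGA", k=5)
print(products)  # ['ACGTGACCTAGGCATTCGGAC']
```

Because the walk starts at the forward primer and stops at the reverse primer site, only the k-mers on paths between the two primers are ever visited, rather than the whole graph. A practical tool would additionally need to handle both strands, sequencing errors, primer mismatches, and branching or cyclic paths, none of which this sketch attempts.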