Abstract
Mutational signatures are characteristic patterns of mutation frequencies assumed to be generated by specific mutagenic processes. A growing catalog of mutational signatures exists, but tools to systematically infer relationships between them remain limited. A mutational signature can be viewed as the combined outcome of two processes: DNA damage and DNA repair. Because cancer therapies often target DNA repair, inferring the DNA repair pathways involved is important for treatment design, even when the mutagenic process is unknown. Here, we model the DNA repair step as a transformation, called RePrint, from damaged nucleotides to repair-related mutation patterns conditioned on the damage. We show that RePrint similarity is indicative of shared DNA repair mechanisms, enabling guilt-by-association prediction of DNA repair pathways. Using experimentally annotated signatures from environmental exposures and CRISPR gene knockouts as gold standards, we find that RePrint-based clustering consistently outperforms signature-based clustering across multiple evaluation metrics. We support several guilt-by-association predictions with literature evidence, illustrating RePrint's ability to identify shared repair mechanisms even among signatures with divergent mutational profiles. RePrint provides the first approach to systematically transfer DNA repair information between signatures, opening a path to interpreting signatures of unknown origin and informing therapeutic strategies. An open-source implementation is available at https://github.com/wojtowicz-lab/RePrintPy.
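To make the idea of "mutation patterns conditioned on the damage" concrete, the following minimal sketch normalizes a 96-channel single-base-substitution signature within each trinucleotide context, so that each context's three possible substitutions sum to one. This is an illustrative assumption about how such a conditioning step could look, not necessarily the exact RePrint definition or the RePrintPy API; the channel ordering and the names `CHANNELS`, `conditional_profile`, and `cosine_similarity` are hypothetical.

```python
from math import sqrt

BASES = "ACGT"
# Six pyrimidine-centered substitution classes (reference base, alternate base).
SUBSTITUTIONS = [("C", "A"), ("C", "G"), ("C", "T"),
                 ("T", "A"), ("T", "C"), ("T", "G")]
# Hypothetical channel ordering: (trinucleotide context, alternate base).
CHANNELS = [(left + ref + right, alt)
            for ref, alt in SUBSTITUTIONS
            for left in BASES
            for right in BASES]


def conditional_profile(signature, pseudocount=0.0):
    """Normalize a 96-channel signature within each trinucleotide context.

    The result approximates P(alternate base | context), which removes the
    dependence on how often each context is hit (the assumed "damage" part).
    """
    if len(signature) != len(CHANNELS):
        raise ValueError("expected one value per channel (96 in total)")
    # Total (smoothed) mass of each of the 32 contexts.
    totals = {}
    for (context, _), value in zip(CHANNELS, signature):
        totals[context] = totals.get(context, 0.0) + value + pseudocount
    profile = []
    for (context, _), value in zip(CHANNELS, signature):
        total = totals[context]
        # Fall back to a uniform split if a context carries no mass at all.
        profile.append((value + pseudocount) / total if total > 0 else 1.0 / 3.0)
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two equally long, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Under this sketch, two signatures that differ only in how mutations are distributed across contexts yield identical conditional profiles, which is the kind of invariance a repair-focused comparison would aim for; comparing profiles with `cosine_similarity` is likewise only one possible choice of similarity measure.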