Abstract
All cell lineages accumulate mutations over time, increasing the probability that some lineages eventually become malignant. Many of the processes that generate mutations leave a characteristic footprint in the genome, which allows their presence to be detected. However, the mutational pattern in a tumour is usually the combined result of multiple mutational processes acting simultaneously, so disentangling the different footprints and their relative impact becomes a deconvolution problem. Several algorithms have been developed for this purpose, most of which factorise the mutation count matrix into two non-negative matrices representing, respectively, the underlying mutational signatures and the relative exposures to these signatures. Here, we introduce StarSignDNA, an algorithm for mutational signature analysis that offers efficient re-fitting and de novo mutational signature extraction. StarSignDNA performs mutation-to-signature attribution, assigning each mutation to its most probable causative mutational signature(s) based on the mutation's trinucleotide context and the signature profiles. StarSignDNA demonstrates robust and clinically relevant performance in low-mutation contexts, with particular strengths in re-fitting analysis and balanced performance in de novo signature discovery. The package provides a command-line interface and data visualisation routines. It is available at https://github.com/uio-bmi/StarSignDNA and can be installed via PyPI.
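
To make the factorisation and attribution ideas concrete, the following is a minimal sketch in plain Python. It is not StarSignDNA's implementation or API. It assumes a generic model in which the expected counts for a sample are the exposure-weighted sum of the signature profiles, and in which the probability that a mutation in a given context arose from a signature is proportional to that signature's exposure times its probability for the context. The function names, the three-context toy signatures (real analyses typically use 96 trinucleotide contexts), and all numbers are hypothetical.

```python
# Toy illustration only: all names and numbers below are hypothetical
# and do not come from the StarSignDNA package.

def reconstruct(exposures, signatures):
    """Expected counts per context: sum over k of exposures[k] * signatures[k][c]."""
    n_contexts = len(signatures[0])
    return [
        sum(e * sig[c] for e, sig in zip(exposures, signatures))
        for c in range(n_contexts)
    ]


def attribute(exposures, signatures, context):
    """Probability that a mutation in `context` arose from each signature.

    Assumes P(k | c) is proportional to exposures[k] * signatures[k][c].
    """
    weights = [e * sig[context] for e, sig in zip(exposures, signatures)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no signature can produce this context")
    return [w / total for w in weights]


# Two hypothetical signatures over three contexts (each row sums to 1).
signatures = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
# Hypothetical number of mutations contributed by each signature in one sample.
exposures = [30.0, 10.0]

expected_counts = reconstruct(exposures, signatures)
posterior = attribute(exposures, signatures, context=0)
# Index of the most probable causative signature for a mutation in context 0.
most_probable = max(range(len(posterior)), key=posterior.__getitem__)
```

In this sketch, re-fitting would correspond to estimating `exposures` with `signatures` held fixed, and de novo extraction to estimating both from the observed count matrix; neither estimation step is shown here.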