Abstract
MOTIVATION: Proteins are dynamic systems whose function is sensitive to environmental conditions and which often serve multiple cellular roles. Deep mutational scanning (DMS) experiments generate extensive datasets that capture the functional consequences of mutations. However, the sheer volume of data makes visualization and interpretation challenging. Current approaches often rely on heatmaps, which fail to capture the nuanced effects of amino acid substitutions that are essential for understanding mutational impact. RESULTS: To address this, we extend the Rosace framework with Rosace-AA, a model that incorporates both position-specific information and amino acid substitution trends. Using substitution matrices such as BLOSUM90, Rosace-AA infers interpretable scores from the raw counts of growth-based DMS data at both the protein level and the position level, while simultaneously inferring the effect of each variant. We demonstrate its utility across datasets, including OCT1 and MET kinase, showing that Rosace-AA highlights key positions where mutations deviate from expected substitution patterns and captures functionally relevant variation in protein behavior across multiple DMS screens. These results suggest that Rosace-AA enables more robust and interpretable analysis of complex DMS datasets. AVAILABILITY AND IMPLEMENTATION: An R package implementing Rosace-AA, with accompanying vignettes, is available at https://github.com/pimentellab/rosace-aa. Scripts for processing the data and generating the figures in this article are also available on GitHub (https://github.com/roserao/rosaceaa-paper-script).