Abstract
Mathematical models that describe sequence-function relationships are widely used in computational biology. A key challenge when interpreting these models is that their parameters are not uniquely determined: many different parameter choices can encode the same sequence-function landscape. These ambiguities, known as "gauge freedoms," must be resolved before parameter values can be meaningfully interpreted. Resolving gauge freedoms requires imposing mathematical constraints on parameters that remove these degrees of freedom, a procedure called "fixing the gauge." We recently developed mathematical methods for fixing the gauge of a large class of commonly used models, but the direct computational implementation of these methods is often impractical due to the need for projection matrices whose memory requirements scale quadratically with the number of parameters. Here we introduce GaugeFixer, a Python package that exploits the specific mathematical structure of gauge-fixing projections to achieve linear scaling, thus enabling application to models with millions of parameters. To demonstrate GaugeFixer, we analyze the local structure of peaks in an empirical fitness landscape for translation initiation. GaugeFixer reveals striking similarities, but also fine-scale variation, in ribosome binding preferences at different positions relative to the start codon, thereby facilitating the interpretation of an otherwise unwieldy fitness landscape. GaugeFixer thus fills an unmet need in the computational tools available for biologically interpreting sequence-function relationships.
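
As a concrete illustration of a gauge freedom, the following minimal sketch uses a toy additive model with a constant term and one parameter per position and character. It does not use GaugeFixer or reproduce its API or algorithm; the function names, the toy dimensions, and the choice of the zero-sum gauge are assumptions made purely for illustration. The sketch shows that two different parameter sets encode the same landscape, that fixing the gauge maps both to the same parameters, and that the per-position operation (linear in the number of parameters) agrees with an explicit dense projection matrix (quadratic in the number of parameters).

```python
import itertools
import numpy as np

def predict(theta0, theta, seq):
    """Additive model: a constant plus one term per position."""
    return theta0 + sum(theta[l, c] for l, c in enumerate(seq))

def fix_zero_sum(theta0, theta):
    """Zero-sum gauge: move each position's mean into the constant term.
    Touches each parameter once, so cost is linear in the parameter count."""
    means = theta.mean(axis=1, keepdims=True)
    return theta0 + means.sum(), theta - means

def dense_projection(L, A):
    """The same centering of the additive terms, written as an explicit
    (L*A) x (L*A) matrix. Memory grows quadratically with L*A."""
    centering = np.eye(A) - np.ones((A, A)) / A
    return np.kron(np.eye(L), centering)

rng = np.random.default_rng(0)
L, A = 3, 4  # toy sizes (hypothetical): 3 positions, 4 characters
theta0, theta = 1.5, rng.normal(size=(L, A))

# A gauge transformation: shift each position's parameters by a constant
# and compensate in the constant term. Predictions are unchanged.
shift = rng.normal(size=(L, 1))
theta0_alt, theta_alt = theta0 - shift.sum(), theta + shift

seqs = list(itertools.product(range(A), repeat=L))
f = np.array([predict(theta0, theta, s) for s in seqs])
f_alt = np.array([predict(theta0_alt, theta_alt, s) for s in seqs])
same_landscape = np.allclose(f, f_alt)

# Fixing the gauge sends both parameter sets to the same representative.
fixed0, fixed = fix_zero_sum(theta0, theta)
fixed0_alt, fixed_alt = fix_zero_sum(theta0_alt, theta_alt)
same_parameters = np.allclose(fixed, fixed_alt) and np.isclose(fixed0, fixed0_alt)

# The structured operation matches the dense projection on the additive terms.
P = dense_projection(L, A)
matches_dense = np.allclose(P @ theta.ravel(), fixed.ravel())
```

In this toy case the dense matrix is block diagonal with identical blocks, which is why it never needs to be formed explicitly. Models with higher-order interaction terms involve more elaborate structure than this sketch covers.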