Abstract
Research on protein stability changes is vital for understanding disease mechanisms and optimizing industrial enzymes. Mutations can alter protein thermal stability, an effect quantified as the change in folding free energy (ΔΔG) between the wild-type and mutant proteins. Despite advances, most models focus on single-point mutations and overlook multipoint and indel mutations. A single-point mutation typically has a relatively limited impact on protein function, so more drastic modifications are often needed to meet new challenges. Current methods for multipoint mutations yield poor results, and no existing method handles indel mutations of arbitrary length. To address this, we introduce UniMutStab, a shared-graph convolutional network that leverages protein language models and residue interaction networks to assess any type of mutation. An embedded edge weight module enhances the integration of residue node features and interactions, improving prediction accuracy. Trained on the "Mega-scale" dataset of ~780 000 mutations, UniMutStab surpasses existing methods in predicting protein stability changes. As a purely sequence-based approach that handles arbitrary mutation types, it generalizes robustly across multiple tasks and could contribute to protein engineering, personalized therapeutics, and diagnostic methodologies.