Abstract
The performance characteristics of a heterologously expressed enzyme may be critical for an application, but they arise from a complex sequence-structure-function dependency. Because insight into sequence-structure-function relationships is limited, a large number of sequence-similar enzyme variants typically must be screened. The scale of screening required hinders industrial strain development in laboratories smaller than biofoundries. We present an in silico method that refines an enzyme candidate set for experimental screening by structural similarity filtering. The method searches for naturally occurring, non-redundant but functionally equivalent orthologs by combining a conventional homologous sequence search with filtering by structural similarity using AlphaFold-predicted protein structure models. We demonstrated the method by finding enzyme candidates similar to two non-ribosomal peptide synthetases: aspergillic acid synthetase (asaC) from Aspergillus flavus and chrysogine synthetase (chyA) from Penicillium rubens. The sequence similarity searches alone yielded tens of thousands of candidates for each enzyme. However, filtering by global structural similarity and assessing active site residues narrowed the enzyme candidates down to 24 similar to asaC and one similar to chyA. Thus, filtering by structural similarity can efficiently refine enzyme candidate sets, reducing the experimental screening effort and accelerating the development of strains for applications.