Abstract
Deep learning has rapidly emerged as a promising toolkit for protein optimization, yet its success remains limited, particularly in improving protein activity. Moreover, most algorithms have not been evaluated rigorously over multiple iterative rounds, a crucial aspect of protein engineering exemplified by classical directed evolution. This study introduces DeepDE, a robust iterative deep learning-guided algorithm that uses triple mutants as building blocks and a compact library of ∼1,000 mutants for training. Triple mutants allow each iteration to explore a much larger sequence space than single or double mutants. When applied to GFP from Aequorea victoria, DeepDE achieved a remarkable 74.3-fold increase in activity over four rounds of evolution, far surpassing the benchmark superfolder GFP. Our results suggest that limited screening of an experimentally affordable ∼1,000 variants significantly enhances the performance of DeepDE, likely by mitigating the intractable data sparsity problem that constrains protein engineering.
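
To give a sense of scale for the sequence-space claim, the following minimal sketch counts the distinct variants that carry exactly k amino-acid substitutions. It is an illustration only, not part of DeepDE. The function name `count_variants` is hypothetical, and the sequence length of 238 residues for Aequorea victoria GFP and the 20-letter amino-acid alphabet are assumptions that the abstract does not state.

```python
from math import comb


def count_variants(seq_len: int, n_mut: int, alphabet: int = 20) -> int:
    """Count sequences that differ from a parent at exactly n_mut positions.

    Choose which n_mut positions to mutate, then pick one of the
    (alphabet - 1) alternative residues at each chosen position.
    """
    return comb(seq_len, n_mut) * (alphabet - 1) ** n_mut


# Assumption: 238 residues for avGFP (not given in the abstract).
SEQ_LEN = 238

for k in (1, 2, 3):
    print(f"{k} substitution(s): {count_variants(SEQ_LEN, k):,} variants")
```

Under these assumptions the count grows by roughly three orders of magnitude with each added substitution, which is the sense in which triple mutants open up a far larger space per iteration, and also why a library of ∼1,000 screened variants covers only a very small fraction of it.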