SPDesign: protein sequence designer based on structural sequence profile using ultrafast shape recognition

SPDesign:基于结构序列特征和超快速形状识别技术的蛋白质序列设计器

阅读:1

Abstract

Protein sequence design can provide valuable insights into biopharmaceuticals and disease treatments. Currently, most protein sequence design methods based on deep learning focus on network architecture optimization, while ignoring protein-specific physicochemical features. Inspired by the successful application of structure templates and pre-trained models in the protein structure prediction, we explored whether the representation of structural sequence profile can be used for protein sequence design. In this work, we propose SPDesign, a method for protein sequence design based on structural sequence profile using ultrafast shape recognition. Given an input backbone structure, SPDesign utilizes ultrafast shape recognition vectors to accelerate the search for similar protein structures in our in-house PAcluster80 structure database and then extracts the sequence profile through structure alignment. Combined with structural pre-trained knowledge and geometric features, they are further fed into an enhanced graph neural network for sequence prediction. The results show that SPDesign significantly outperforms the state-of-the-art methods, such as ProteinMPNN, Pifold and LM-Design, leading to 21.89%, 15.54% and 11.4% accuracy gains in sequence recovery rate on CATH 4.2 benchmark, respectively. Encouraging results also have been achieved on orphan and de novo (designed) benchmarks with few homologous sequences. Furthermore, analysis conducted by the PDBench tool suggests that SPDesign performs well in subdivided structures. More interestingly, we found that SPDesign can well reconstruct the sequences of some proteins that have similar structures but different sequences. Finally, the structural modeling verification experiment indicates that the sequences designed by SPDesign can fold into the native structures more accurately.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。