CIRPIN: Learning Circular Permutation-Invariant Representations to Uncover Putative Protein Homologs

Abstract

Protein structure-based homology detection has been revolutionized by deep learning methods that can rapidly search massive databases. However, current structural search tools often miss proteins related by topological rearrangements, particularly circular permutation (CP), where proteins share identical global folds but differ in the positioning of their termini. We introduce a circular permutation-invariant graph neural network (CIRPIN) that addresses this limitation through a novel data augmentation strategy using synthetic circular permutations (synCPs). We demonstrate that CIRPIN learns representations of proteins that are invariant to circular permutation, enabling it to identify similar proteins within the Structural Classification of Proteins - extended (SCOPe) and AlphaFold Cluster Representatives (AFDB-ClustR) databases. Leveraging the speed of CIRPIN and the accuracy of traditional structural alignment tools, we search these databases and uncover thousands of novel protein pairs related by circular permutation. Notably, we discover that PDZ domains exist naturally in four circularly permuted forms. These results highlight CIRPIN as a powerful tool to investigate the emergence of circular permutations in nature.
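
To make the notion of a circular permutation concrete, the following is a minimal sketch of how a synthetic circular permutation of a single chain could be generated: the residue order is rotated so that a chosen cut point becomes the new N-terminus, while the 3D coordinates themselves stay where they are. The function name `synthetic_circular_permutation`, the `cut` parameter, and the use of C-alpha coordinates are illustrative assumptions, not the procedure CIRPIN uses; the abstract does not specify how synCPs are constructed.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def synthetic_circular_permutation(
    sequence: str, ca_coords: Sequence[Coord], cut: int
) -> Tuple[str, List[Coord]]:
    """Rotate residue order so that residue index `cut` becomes the new first residue.

    Hypothetical helper for illustration only. The coordinates are reordered,
    not moved, so the global fold is unchanged and only the termini differ.
    """
    if len(sequence) != len(ca_coords):
        raise ValueError("sequence and coordinates must have the same length")
    if not sequence:
        raise ValueError("cannot permute an empty chain")
    cut %= len(sequence)  # allow any integer cut position
    new_sequence = sequence[cut:] + sequence[:cut]
    new_coords = list(ca_coords[cut:]) + list(ca_coords[:cut])
    return new_sequence, new_coords


# Toy six-residue chain with made-up coordinates along a line.
toy_sequence = "ABCDEF"
toy_coords = [(float(i), 0.0, 0.0) for i in range(6)]
permuted_sequence, permuted_coords = synthetic_circular_permutation(
    toy_sequence, toy_coords, cut=2
)
print(permuted_sequence)  # CDEFAB
```

The permuted chain contains exactly the same set of residue positions as the original, which is why a representation that is invariant to circular permutation should map both to similar embeddings. A realistic generator would also need to consider whether the original termini are close enough in space to be joined, which this sketch ignores.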