SSE-TSR: An Approach to Integrate Secondary Structure Elements Into Triangular Spatial Relationships for Protein Classification

Abstract

Protein structures are fundamental to understanding biological function, yet many detailed similarities remain hidden from conventional alignment-based or 3D superposition methods. The Triangular Spatial Relationship (TSR) method offers an alignment-free encoding of backbone geometry; however, classical TSR ignores the context of secondary structure elements (SSEs), such as helices, strands, and coils. To address this, we introduce SSE-TSR, which enriches each TSR key by assigning it one of 18 helix-strand-coil combination labels derived from DSSP-style annotations in PDB HELIX/SHEET records. By mapping each protein's SSE-TSR keys into a sparse tensor, SSE-TSR compactly captures both tertiary geometry and local secondary motifs. We evaluated SSE-TSR on four datasets, two structural (CATH-based, 9.2 K; SCOP-based, 7.0 K) and two functional (published, 7.8 K; new, 7.2 K), using a 3D convolutional neural network. On structure-based tasks, SSE-TSR raises accuracy from 96.00% to 98.33% (CATH-based) and from 95.46% to 99.00% (SCOP-based). On functional tasks, it yields modest yet consistent gains (e.g., from 99.41% to 99.50% and from 95.83% to 98.83%). Comparisons to Foldseek confirm competitive accuracy across diverse tasks. Additionally, the sparse tensor representation enables memory-efficient handling of large-scale datasets, making SSE-TSR practical for extensive bioinformatics analyses. These results demonstrate that SSE-TSR is a scalable, interpretable, and robust method for protein classification and structural bioinformatics.
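
To make the idea of an SSE-enriched triangle key concrete, the following minimal Python sketch bins the geometry of each triangle of backbone points, attaches a secondary-structure label, and accumulates counts in a sparse (dictionary-backed) tensor. It is an illustration under stated assumptions, not the authors' implementation: the function names, the bin widths, and the choice of geometric features are all hypothetical, and the label scheme used here (the unordered combination of three H/E/C classes, which yields 10 labels) is a simplification, since the abstract does not define how the 18 SSE-TSR labels are constructed.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement
import math

# Hypothetical label scheme: unordered combination of three SSE classes
# (C = coil, E = strand, H = helix). This gives 10 labels, NOT the 18
# labels of SSE-TSR, whose definition is not given in the abstract.
SSE_LABELS = {
    combo: index
    for index, combo in enumerate(combinations_with_replacement("CEH", 3))
}

def triangle_bins(p1, p2, p3, dist_bin=2.0, angle_bin=15.0):
    """Return (longest-edge bin, angle bin) for one triangle.

    The bin widths and the choice of angle (the one opposite the
    shortest edge) are illustrative assumptions.
    """
    a, b, c = sorted([math.dist(p1, p2), math.dist(p1, p3), math.dist(p2, p3)])
    if b == 0.0:
        # All three points coincide; treat the angle as zero.
        theta = 0.0
    else:
        # Law of cosines for the angle opposite the shortest edge a.
        cos_t = (b * b + c * c - a * a) / (2 * b * c)
        theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    return int(c // dist_bin), int(theta // angle_bin)

def sse_tsr_sparse(coords, sse):
    """Count (distance bin, angle bin, SSE label) keys over all triangles.

    coords: list of (x, y, z) points, e.g. C-alpha atoms.
    sse:    string of per-residue classes drawn from 'H', 'E', 'C'.
    Only keys that actually occur are stored, so the Counter acts as a
    sparse 3D tensor.
    """
    tensor = Counter()
    for i, j, k in combinations(range(len(coords)), 3):
        d_bin, t_bin = triangle_bins(coords[i], coords[j], coords[k])
        label = SSE_LABELS[tuple(sorted((sse[i], sse[j], sse[k])))]
        tensor[(d_bin, t_bin, label)] += 1
    return tensor

if __name__ == "__main__":
    toy_coords = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5)]
    for key, count in sorted(sse_tsr_sparse(toy_coords, "HHEC").items()):
        print(key, count)
```

Storing only the occupied keys is what keeps memory use low: the number of entries grows with the number of distinct keys observed in a protein rather than with the full size of the binned key space. Enumerating every triple is cubic in the number of residues, so this sketch is only suitable for small inputs.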