Complete end-to-end learning from protein feature representation to protein interactome inference

Abstract

BACKGROUND: Co-fractionation coupled with mass spectrometry (CF-MS) is a powerful strategy for mapping protein-protein interactions (PPIs) under near-physiological conditions. Despite recent progress, existing analysis pipelines remain constrained by reliance on handcrafted features, sensitivity to experimental noise, and an inherent focus on pairwise interactions, which limit their scalability and generalizability. To address these difficulties, we introduce FREEPII (Feature Representation Enhancement End-to-End Protein Interaction Inference), a unified deep learning framework that integrates CF-MS data with sequence-derived features to learn biologically meaningful protein-level representations for accurate and efficient inference of PPIs and protein complexes.
RESULTS: FREEPII employs a convolutional neural network architecture to learn protein-level representations directly from raw data, enabling feature sharing across interaction pairs and reducing computational complexity. To enhance robustness against CF-MS noise, protein sequences are introduced as auxiliary input to enrich the feature space with complementary biological cues. The supervised protein embeddings further encode network-level context derived from complex annotations, allowing the model to capture higher-order interactions and enhance the expressive power of protein representations. Extensive benchmarking demonstrates that FREEPII consistently outperforms state-of-the-art CF-MS analysis tools, capturing more biologically coherent and discriminative protein features. Cross-dataset evaluations further reveal that integrating multimodal data from diverse experimental contexts substantially improves the generalization and sensitivity of data-driven models, offering a scalable, cross-species strategy for reliable protein interaction inference.
CONCLUSIONS: FREEPII provides a unified computational framework that integrates CF-MS data and sequence-derived features to learn discriminative and biologically consistent protein representations. By leveraging multimodal inputs through a coherent deep learning architecture, the model achieves accurate and scalable inference of PPIs and protein complexes across species. Its modality-aware design and supervised protein embeddings capture higher-order interaction contexts, ensuring robust generalization and reliable discovery of novel interactions. Overall, FREEPII offers a flexible and extensible foundation for data-driven exploration of protein interaction networks.
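
The idea of protein-level representations shared across interaction pairs can be illustrated with a small sketch. This is not the FREEPII implementation: the function names, the random untrained kernels, the amino-acid-composition sequence feature, and the cosine-similarity pair score are all assumptions chosen for illustration. The point it shows is that each protein is encoded once from its elution profile and sequence, so scoring all pairs reuses N embeddings instead of building a separate feature vector for each of the N*(N-1)/2 pairs.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def conv1d_features(profile, kernels):
    """Slide each kernel over a CF-MS elution profile (valid cross-correlation),
    apply ReLU, then global max pooling. Returns one value per kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(profile, kernels.shape[1])
    activations = np.maximum(windows @ kernels.T, 0.0)  # (positions, n_kernels)
    return activations.max(axis=0)


def sequence_features(sequence):
    """Amino acid composition: a simple stand-in for sequence-derived features."""
    counts = np.array([sequence.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(sequence), 1)


def encode_protein(profile, sequence, kernels):
    """Per-protein embedding that concatenates both modalities."""
    return np.concatenate(
        [conv1d_features(profile, kernels), sequence_features(sequence)]
    )


def pair_scores(embeddings):
    """Cosine similarity between all protein embeddings.
    Assumption: a trained model would use a learned pair classifier instead."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Hypothetical toy data: 3 proteins, 20 fractions each, 4 untrained kernels of width 5.
rng = np.random.default_rng(0)
kernels = rng.normal(size=(4, 5))
profiles = rng.random((3, 20))
sequences = ["MKTAYIAKQR", "MKTAYIAKQL", "GGSWWPLLNE"]

# Each protein is encoded exactly once; every pair then reuses these embeddings.
embeddings = np.stack(
    [encode_protein(p, s, kernels) for p, s in zip(profiles, sequences)]
)
scores = pair_scores(embeddings)
print(scores.shape)
```

In a trained model the kernels and the pair scorer would be learned from labeled interactions and complex annotations; here they only fix the data flow from raw profile and sequence to a pairwise score matrix.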
