LANTERN: TCR-peptide binding prediction via large language model representations



Abstract

Predicting interactions between T-cell receptors (TCRs) and peptide-major histocompatibility complexes (pMHCs) is critical for advancing targeted immunotherapies and personalized medicine. However, existing models often struggle with limited labeled data and generalize poorly to novel epitopes. We present LANTERN (Large lAnguage model-powered TCR-Enhanced Recognition Network), a novel deep learning framework that combines pretrained protein and molecular language models with a cross-modality fusion mechanism. Specifically, LANTERN encodes TCR sequences with ESM and encodes peptides, represented as Simplified Molecular Input Line Entry System (SMILES) strings, with MolFormer, capturing both evolutionary and chemical properties. A Multi-Head Cross-Attention (MHCA) module aligns the TCR and peptide representations, enabling the model to focus on interaction-relevant features across domains. This architecture improves generalization in zero-shot and few-shot scenarios. Extensive experiments on the TCHard benchmark demonstrate that LANTERN achieves competitive and robust performance compared with existing baselines, particularly under the challenging random control and unseen epitope settings. These results highlight LANTERN's potential for robust TCR-pMHC binding prediction and for downstream applications in personalized immunotherapy and vaccine development. To support reproducibility, our code is available at: https://anonymous.4open.science/r/LANTERN-87D9.
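
The cross-modality fusion step described above can be illustrated with a generic multi-head cross-attention sketch. This is not the LANTERN implementation: the attention direction (TCR embeddings as queries, peptide embeddings as keys and values), the shared embedding width, the random stand-in weights, and every name in the code are assumptions made for illustration. In practice the ESM and MolFormer outputs would presumably need to be projected to a common width first.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a learned projection matrix."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def cross_attention(queries, context, num_heads, rng):
    """Multi-head cross-attention: each query row attends over all context rows.

    queries: n_q rows of width d (assumed here: TCR residue embeddings)
    context: n_c rows of width d (assumed here: peptide token embeddings)
    Returns the fused output (n_q rows of width d) and per-head attention weights.
    """
    d = len(queries[0])
    assert d % num_heads == 0, "embedding width must be divisible by num_heads"
    d_head = d // num_heads
    w_q, w_k, w_v, w_o = (random_matrix(d, d, rng) for _ in range(4))
    q, k, v = matmul(queries, w_q), matmul(context, w_k), matmul(context, w_v)

    head_outputs, head_weights = [], []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        k_t = [list(col) for col in zip(*k_h)]      # transpose keys
        scores = matmul(q_h, k_t)                   # n_q x n_c similarity scores
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v_h))   # weighted sum of values
        head_weights.append(weights)

    # Concatenate the heads for each query position, then mix with w_o.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(queries))]
    return matmul(concat, w_o), head_weights

# Toy inputs: 5 TCR positions and 3 peptide tokens, both of width 8.
rng = random.Random(0)
tcr = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
peptide = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]
fused, attn = cross_attention(tcr, peptide, num_heads=2, rng=rng)
```

Here `fused` holds one peptide-aware vector per TCR position, and each row of `attn[h]` is a probability distribution over peptide tokens for head `h`. A trained model would learn the projection matrices rather than sample them.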
