Abstract
MOTIVATION: T cell receptors (TCRs) are fundamental components of the adaptive immune system, recognizing specific antigens for targeted immune responses. Understanding their sequence patterns is crucial for designing effective vaccines and immunotherapies. However, the vast diversity of TCR sequences and the complexity of their binding mechanisms make it difficult to generate TCRs that are specific to a particular epitope.
RESULTS: Here, we propose TCR-epiDiff, a diffusion-based deep learning model for generating epitope-specific TCRs and predicting TCR-epitope binding. TCR-epiDiff integrates epitope information during TCR sequence embedding using ProtT5-XL and employs a denoising diffusion probabilistic model for sequence generation. Using external validation datasets, we demonstrate that the model can generate biologically plausible, epitope-specific TCRs. Furthermore, we leverage the model's encoder to develop a TCR-epitope binding predictor that shows robust performance on the external validation data. Our approach addresses both de novo generation of epitope-specific TCRs and TCR-epitope binding prediction. This capability offers insights into immune diversity and has the potential to advance targeted immunotherapies.
AVAILABILITY AND IMPLEMENTATION: The data and source code for our experiments are available at https://github.com/seoseyeon/TCR-epiDiff.