Multi-View Transformers for Structure-Aware HA-NA Drift Risk Scoring and Mutation Hotspot Mapping



Abstract

Seasonal influenza A evolves quickly through mutations in haemagglutinin (HA) and neuraminidase (NA), which can reduce vaccine match and lower protection. Many sequence-only models do not link codon-level mutations to three-dimensional (3D) protein context and long-term evolutionary signals within one scoring framework. This study presents TRIAD-Influenza (TRIAD: Token-Residue-Integrated Architecture for Drift), a multi-view transformer that combines (i) codon- and residue-level sequence representations, (ii) structure-derived residue interaction features from predicted HA/NA models, and (iii) an embedding-space phylogeny that captures cluster and drift context. The pipeline curates more than 3×10⁵ paired HA/NA coding sequences from the NCBI Virus resource (2010-2024) using strict quality control and codon-aware alignment and predicts 3D structures for nearly all unique HA and NA proteins to build contact graphs and surface/stability descriptors. TRIAD-Influenza outputs a continuous, structure-aware risk score for each HA/NA pair and produces interpretable mutation hotspot maps using gradient saliency and a contact-weighted mutation risk index (CMRI). On rolling-origin temporal cross-validation and for a temporally held-out internal test window with strong class imbalance (∼3.4% high-risk), the model shows strong ranking performance (AUROC ≈ 0.89; AUPRC ≈ 0.44; Brier score = 0.069) while operating at surveillance speed (median latency ≈ 1.6 ms per HA/NA pair). External validation on independent GISAID/Nextstrain cohorts (2023-2024; 5000 isolates) preserves discrimination (AUROC ≈ 0.85-0.86). Predicted risk scores correlate with experimental haemagglutination inhibition (HI) antigenic distances (Spearman ρ up to ≈ 0.82 at the virus-aggregated level), and CMRI hotspots enrich known epitope and deep mutational scanning escape residues (odds ratios ≈ 2.7-3.6).
Overall, token-residue-phylogeny coupling enables rapid, structure-aware prioritisation of emerging influenza A HA/NA sequences and delivers compact hotspot maps for expert review and targeted experiments.
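
The evaluation protocol named in the abstract (rolling-origin temporal cross-validation, with the Brier score as a calibration measure) can be illustrated with a minimal sketch. The abstract does not state how folds are defined, so the code below assumes an expanding training window split by collection year; the function names `rolling_origin_splits` and `brier_score`, the `horizon` parameter, and the toy data are all hypothetical and are not taken from the study.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_origin_splits(
    years: Sequence[int], first_test_year: int, horizon: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) index lists for an expanding-window split.

    Assumption: each sample carries a collection year, and every fold trains
    on all samples strictly earlier than the fold's test window.
    """
    last_year = max(years)
    origin = first_test_year
    while origin <= last_year:
        train = [i for i, y in enumerate(years) if y < origin]
        test = [i for i, y in enumerate(years) if origin <= y < origin + horizon]
        if train and test:  # skip folds with no training or no test samples
            yield train, test
        origin += horizon  # move the forecast origin forward


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 label."""
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data: collection years for six hypothetical HA/NA pairs.
toy_years = [2010, 2011, 2011, 2012, 2013, 2013]
splits = list(rolling_origin_splits(toy_years, first_test_year=2012))

# Toy labels (1 = high-risk) and predicted probabilities.
toy_score = brier_score([0, 0, 1, 0], [0.1, 0.2, 0.7, 0.0])
```

Because each fold only ever trains on earlier years, this kind of split avoids leaking future sequences into training, which is the usual motivation for rolling-origin evaluation in surveillance settings. Lower Brier scores indicate better-calibrated probabilities; with a rare positive class it is normally read alongside ranking metrics such as AUROC and AUPRC.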
