Abstract
INTRODUCTION: The human oral cavity hosts a complex microbial ecosystem of bacteria, viruses (including bacteriophages), and other microorganisms that form biofilms in distinct niches. Phage-bacteria host specificity is crucial in shaping microbial community stability and dysbiosis. However, mapping this specificity is limited by experimental constraints, and traditional methods cannot capture its ecological complexity. This study aims to create a graph-based model that treats inter-microbial host specificity as a relational learning problem, integrating taxonomic, ecological, and infection data into a knowledge graph. The approach is intended to improve phage-bacteria host prediction and to reveal microbial hubs and interaction patterns related to dysbiosis in periodontal disease. METHODS: This study introduces a lightweight, relation-aware knowledge graph framework for predicting microbial host specificity in oral biofilms. We built a heterogeneous graph of the oral microbiome that incorporates microbial taxa, anatomical sites, taxonomic hierarchies, enrichment patterns, and INFECTS relationships. The dataset includes 500 viral taxa across four oral niches, with 21,338 significant co-occurrence relationships and various biological features. To learn meaningful representations, we combined graph embeddings with microbial features. We developed a relation-aware graph neural network, IK-BRNet, to efficiently encode ecological and interaction semantics. RESULTS: Model performance was evaluated against a conventional Graph Attention Network (GAT) using stratified training, validation, and test splits with class imbalance correction. IK-BRNet converged faster and discriminated better, achieving a higher AUC-ROC (0.929 vs. 0.904) and markedly higher sensitivity for disease-associated viral taxa (93.8% vs. 56.3%). Although the baseline GAT achieved higher accuracy and specificity, IK-BRNet consistently produced fewer false negatives, improving its ability to detect disease-related microbial signals.
Site-specific predictions were biologically plausible, with the highest disease scores for dental plaque-associated viruses and lower scores in healthy niches such as the tongue and buccal mucosa. CONCLUSION: This study shows that relation-aware graph learning offers a meaningful and efficient way to model inter-microbial host specificity in oral biofilms. The framework improves oral microbiome network inference and supports disease screening, ecological analysis, and microbiome-based dentistry.