Abstract
BACKGROUND: Plant-microbe interactions in the rhizosphere are central to plant growth, nutrient acquisition, and stress resilience. Although multi-omics approaches enable comprehensive profiling of different biological layers, integrating these data to understand the mechanisms underlying plant-microbe symbiosis, particularly under drought stress, remains a challenge. RESULTS: Genomic, metabolomic, and microbiome data from 198 soybean accessions grown under both control and drought conditions were integrated to identify environment-specific predictive features of plant phenotypes. We compared best linear unbiased prediction (BLUP), genome-wide association study (GWAS), and a nonlinear machine learning model to evaluate their ability to detect informative features. The machine learning approach provided flexible variable selection and outperformed the linear models in capturing nonlinear dependencies. Model interpretation using SHapley Additive exPlanations (SHAP) indicated that the isoflavone derivative daidzin and the drought-tolerant Candidatus Nitrosocosmicus were major contributors to phenotypic variation, specifically under drought stress. SHAP-based interaction networks indicated cross-omics links, including connections between daidzin, gamma-aminobutyric acid (GABA), and Paenibacillus. CONCLUSION: The proposed interpretable machine learning approach for plant phenotype prediction identified multi-omics biomarkers and interactions, providing insights into plant adaptation to drought stress through environment-dependent rhizosphere networks and symbiotic associations.