Abstract
Durum wheat breeding increasingly requires tools that can anticipate how genotypes respond to the shifting mix of heat and drought typical of Mediterranean drylands. In this study, we explored whether a multimodal deep learning (MM-DL) framework that brings together genomic markers, environmental covariates (ECs), near-infrared spectroscopy (NIRS), and phenology can improve predictions for five key yield-related traits: grain number per spike (GN), grain weight per spike (GW), number of spikelets per spike (NS), spike length (SL), and spike weight (SW). The evaluation relied on multi-environment data from three contrasting seasons and used two scenarios directly relevant to breeding: predicting an entirely unseen season (Prediction of Year, PoY) and predicting new sowing environments within the same season (Prediction of Sowing Year, PoSY). Across all traits, integrating multiple data sources improved prediction accuracy relative to genomics alone, with gains varying according to the physiological basis of each trait. SL was consistently the easiest trait to predict (PoY ≈ 0.56-0.71; PoSY ≈ 0.74-0.80), followed by NS (PoY ≈ 0.23-0.47; PoSY ≈ 0.56-0.67). GW showed moderate and dependable accuracy (PoY ≈ 0.24-0.30; PoSY ≈ 0.34-0.46), while SW yielded intermediate values (PoY ≈ 0.27-0.38; PoSY ≈ 0.34-0.41). GN remained the most environmentally sensitive trait, with modest accuracy under PoY (0.28-0.32) and clearer gains under PoSY (0.43-0.53). ECs contributed most to improving cross-season transferability, whereas NIRS and phenology added smaller but still useful trait-specific signals. Overall, the results show that combining genomics with enviromics and phenomics produces more stable and biologically informed predictions. For durum wheat breeding under increasing climate variability, MM-DL offers a practical path toward more reliable selection decisions across seasons.
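
As an illustration of the two validation scenarios, the following minimal sketch shows how PoY and PoSY partitions could be generated from season and sowing-environment labels. It is not the study's implementation: the function names, the record layout, and the labels in the usage note are hypothetical, and the PoSY variant assumes that training is restricted to the other sowing environments of the same season, which the abstract does not specify.

```python
from typing import Hashable, Iterator, List, Tuple

# One label pair per phenotypic record: (season, sowing environment).
Record = Tuple[str, str]
Split = Tuple[Hashable, List[int], List[int]]  # (held-out label, train idx, test idx)


def poy_splits(records: List[Record]) -> Iterator[Split]:
    """PoY sketch: hold out every record of one season, train on the rest."""
    for held_out in sorted({season for season, _ in records}):
        test = [i for i, (s, _) in enumerate(records) if s == held_out]
        train = [i for i, (s, _) in enumerate(records) if s != held_out]
        yield held_out, train, test


def posy_splits(records: List[Record]) -> Iterator[Split]:
    """PoSY sketch: within each season, hold out one sowing environment.

    Assumption: the training set is limited to the remaining sowing
    environments of the same season.
    """
    for season in sorted({s for s, _ in records}):
        envs = sorted({e for s, e in records if s == season})
        for held_out in envs:
            test = [i for i, (s, e) in enumerate(records)
                    if s == season and e == held_out]
            train = [i for i, (s, e) in enumerate(records)
                     if s == season and e != held_out]
            yield (season, held_out), train, test
```

With hypothetical labels such as `[("S1", "early"), ("S1", "late"), ("S2", "early"), ("S2", "late")]`, `poy_splits` yields one split per season and `posy_splits` yields one split per season-by-sowing combination. The returned index lists can then be used to subset the marker, EC, NIRS, and phenology inputs before fitting a model.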