Abstract
The structure and function of microbial communities are profoundly influenced by spatio-temporal dynamics. Although machine learning algorithms are widely used to predict phenotypes from microbial communities, particularly for disease forecasting, they do not fully exploit the spatio-temporal dynamics embedded in microbial data. Moreover, data collected at a single time point are often inadequate for accurate prediction of host or environmental phenotypes. This study investigates the interaction dynamics of microbial communities in closed environments using data from two independent research projects. We introduce the microbial spatio-temporal network model, which combines two-stream spatio-temporal graph convolutional networks with long short-term memory to predict dynamic microbial abundance in the human oral cavity and gut. The model captures the temporal trajectories of microbes together with the spatial features embedded in network structures, enabling accurate prediction of future community trends. Experimental validation confirmed its ability to track temporal patterns with high accuracy, even for microbes exhibiting significant fluctuations. Ablation experiments demonstrated that the integrated model outperforms its individual components, harnessing the strengths of both approaches. This technology presents a promising strategy for low-cost, non-invasive early diagnosis of human diseases, offering valuable insights into future health risks.