Abstract
Water quality monitoring is critical for public health, ecology, and economic sustainability, but traditional methods are limited in spatiotemporal coverage and are costly, so they cannot meet real-time assessment needs. Deep learning approaches to water quality prediction are in turn often hindered by the high complexity and noise of raw hydrological time series. To address this, we propose a prediction framework that integrates sliding window feature enhancement, principal component analysis (PCA), and a two-layer regularized gated recurrent unit (TLR-GRU). The goal is high-precision, real-time prediction of four key water quality parameters for aquaculture and irrigation: dissolved oxygen (DO), ammonia nitrogen (NH3-N), total phosphorus (TP), and total nitrogen (TN). Time series regularity is quantified with sample entropy (SampEn, m = 2, r = 0.2 × std(X)), a univariate complexity metric that captures pattern repetition within a series. The SampEn results show that sliding windows reduce complexity by filtering transient noise while retaining ecological patterns, which complements the regularization in TLR-GRU (L2 and dropout) in avoiding overfitting. A total of 4970 water quality records (2020-2023, 4 h sampling interval) were collected from a monitoring station in a typical water body used for aquaculture and irrigation. With PCA-reduced inputs, the TLR-GRU model outperforms six state-of-the-art deep learning models (e.g., TLD-LSTM, WaveNet) on both the base dataset and the sliding window-enhanced dataset. On the latter, the test-set R² for DO and TP rises from 0.82 to 0.93 and from 0.81 to 0.92, and RMSE decreases by 49.4% and 55.6%, respectively. The framework supports water resource management and is applicable to rivers and lakes beyond aquaculture. Future work will optimize the model and integrate multi-source data.
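For readers unfamiliar with the complexity metric, the following is a minimal sketch of sample entropy with the parameters stated above (m = 2, r = 0.2 × std(X)). It is not the authors' implementation. The function names `sample_entropy` and `rolling_mean` are hypothetical, the tolerance test uses the common "distance ≤ r" convention, and the moving average is only an assumed stand-in for the sliding window feature enhancement, whose exact construction the abstract does not specify. The synthetic series and the window of 6 samples (one day at a 4 h interval) are likewise illustrative assumptions.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) of a 1-D series, with r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the first n - m templates for both lengths so the counts are comparable.
        templates = np.lib.stride_tricks.sliding_window_view(x, length)[: n - m]
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template
            # (self-matches are excluded by construction).
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return float(-np.log(a / b))


def rolling_mean(x, window):
    """Plain moving average (assumed stand-in for the sliding window step)."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(window) / window, mode="valid")


if __name__ == "__main__":
    # Synthetic example only: a slow cycle plus transient noise.
    rng = np.random.default_rng(0)
    t = np.arange(600)
    raw = np.sin(2 * np.pi * t / 42) + 0.5 * rng.standard_normal(t.size)
    smoothed = rolling_mean(raw, window=6)
    print("SampEn raw:     ", sample_entropy(raw))
    print("SampEn smoothed:", sample_entropy(smoothed))
```

Lower values indicate a more regular series, so comparing SampEn before and after windowing gives a rough check on how much transient noise the window removes. The pairwise loop costs O(N²) time but only O(N) memory, which keeps it practical for a series of a few thousand records.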