Abstract
PURPOSE: The widespread adoption of electronic medical records (EMRs) has facilitated the prediction of patient prognosis and disease progression, yet irregular sampling and missing values remain inherent challenges for clinical time-series analysis. This study aims to develop a robust framework that handles incomplete EMR data effectively while capturing complex temporal patterns and feature interactions. METHODS: We propose MedGAITS, a novel two-stage graph autoencoder framework for irregular and incomplete clinical time series. The model employs a progressive learning strategy: the first stage performs coarse-grained reconstruction of the zero-filled input via dynamic graph learning, and the second stage refines this representation to extract deep, robust features. Through iterative dynamic graph construction and residual-style information propagation, MedGAITS learns uncertainty-aware representations directly from raw, partially observed data, avoiding the biases introduced by explicit imputation. RESULTS: Across multiple public datasets (PhysioNet 2012, COVID-19, and eICU), MedGAITS achieved performance competitive with or superior to state-of-the-art models in both regression and classification tasks. MedGAITS also provides clinically interpretable insights into COVID-19 progression, identifying neutrophils and LDH as early biomarkers and white blood cell count as a later-stage indicator, thereby characterizing the disease's temporal profile. CONCLUSION: MedGAITS provides an effective solution for handling irregular clinical time-series data with missing values. Its two-stage imputation-and-representation learning design not only improves performance in downstream predictive tasks but also helps uncover clinically meaningful, time-evolving features, offering valuable insights for disease monitoring and biomarker discovery.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13755-026-00434-1.