Abstract
OBJECTIVE: Large language models (LLMs) are attracting increasing interest in healthcare. This commentary evaluates the potential of LLMs to improve clinical prediction models (CPMs) for diagnostic and prognostic tasks, with a focus on their ability to process longitudinal electronic health record (EHR) data. FINDINGS: LLMs show promise in handling multimodal and longitudinal EHR data and can support multi-outcome predictions for diverse health conditions. However, methodological, validation, infrastructural, and regulatory challenges remain. These include inadequate methods for time-to-event modelling, poor calibration of predictions, limited external validation, and bias affecting underrepresented groups. High infrastructure costs and the absence of clear regulatory frameworks further prevent adoption. IMPLICATIONS: Further work and interdisciplinary collaboration are needed to support equitable and effective integration into the clinical prediction. Developing temporally aware, fair, and explainable models should be a priority focus for transforming clinical prediction workflow.