Abstract
INTRODUCTION: Working memory (WM) is a central cognitive constraint in second and foreign language learning, particularly in technology-enhanced instructional environments. While pre-AI computer-assisted language learning (CALL) research has examined how individual differences in WM capacity shape learning with interactive technologies, the rapid emergence of AI-mediated language learning tools raises new questions about how WM demands are managed, redistributed, or compensated for. This review examines how WM has been conceptualized and empirically addressed across two historical eras of language learning technology.
METHODS: This systematic review adopts a PRISMA 2020-compliant historical-comparative design and synthesizes 31 primary empirical studies, comprising 27 studies from the Interactive Era (2010-2024) and 4 studies from the AI-Mediated Era (2024-2025), supplemented by recent systematic reviews and theoretical work. Studies were analyzed as two analytically distinct corpora, with a focus on instructional design features, WM-related outcomes, cognitive load management, and measurement approaches, followed by a cross-era comparison guided by three research questions.
RESULTS: Interactive Era studies show that CALL, multimedia, and online platforms provide multimodal input, adaptive feedback, collaboration, and flexible pacing, but frequently induce cognitive overload and produce unequal learning outcomes associated with individual differences in WM capacity, which is typically treated as a fixed learner constraint. In contrast, AI-mediated studies reveal a qualitative shift. AI-assisted writing reduces lower-level encoding demands while increasing central-executive demands for evaluation and integration; biometric-adaptive reading systems preemptively regulate cognitive load and improve comprehension; and AI-orchestrated VR-AR vocabulary instruction yields large gains only within empirically bounded multimodal channel limits. AI-mediated data-driven learning further offloads corpus search, reallocating WM resources toward noticing and internalization.
DISCUSSION: Despite these advances, direct assessment of WM is largely absent from AI-mediated intervention studies, which rely instead on cognitive load proxies. This measurement gap limits causal inference about whether AI primarily reduces task demands, improves functional WM utilization, or supports WM capacity development. The review calls for future research to incorporate validated WM measures, adopt aptitude-treatment interaction designs, and establish evidence-based boundaries for AI-mediated multimodal adaptivity across diverse EFL and ESL contexts.