Abstract
Influence maximization (IM) seeks to identify a subset of key nodes that maximizes the spread of information or behavior through a network. While traditional IM approaches rely on static topologies or coarse-grained temporal snapshots, real-world social systems evolve continuously, with highly dynamic and bursty interactions that invalidate static assumptions. Existing temporal methods discretize time into snapshots, losing fine-grained event dependencies and failing to model the non-stationary interaction patterns that critically affect diffusion dynamics. To address this challenge, we propose TempRL-IM, a temporal reinforcement learning framework that integrates Continuous-Time Graph Neural Networks (CTGNNs) with a Double Deep Q-Network (DDQN) agent for influence maximization in dynamic networks. The CTGNN encoder captures fine-grained temporal dependencies through memory-based event updates and temporal attention, producing rich node embeddings that reflect the evolving network state. The DDQN agent leverages these embeddings to learn a seed-selection policy at a specified observation time. Unlike snapshot-based baselines, TempRL-IM maintains strict temporal causality and encodes continuous-time dynamics efficiently, without discretization artifacts. Extensive experiments on six real-world temporal datasets demonstrate that TempRL-IM achieves 15-28% higher influence spread and 3-10× faster inference than state-of-the-art learning-based and heuristic methods. The framework also exhibits strong transferability across networks with similar temporal characteristics, highlighting its potential for large-scale applications such as viral marketing, epidemic containment, and information diffusion in dynamic social systems.
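To make the seed-selection step concrete, the following is a minimal sketch and not the authors' implementation: it assumes node embeddings have already been produced by a CTGNN encoder at the observation time, and uses a small Q-network to greedily build a seed set of size k. The names (QNet, select_seeds) and the mean-pooled seed-set summary are illustrative assumptions; during Double DQN training, the online network would select the argmax action while a separate target network evaluates it.

```python
# Illustrative sketch only: Q-network scoring over precomputed temporal
# node embeddings, with greedy seed selection at inference time.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Scores each candidate node given its embedding and a summary
    of the current seed set (both assumed to come from a CTGNN encoder)."""
    def __init__(self, embed_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, node_emb, state_emb):
        # node_emb: (N, d) candidate embeddings; state_emb: (d,) seed-set summary
        state = state_emb.expand(node_emb.size(0), -1)
        return self.mlp(torch.cat([node_emb, state], dim=-1)).squeeze(-1)

@torch.no_grad()
def select_seeds(q_net, node_emb, k):
    """Greedily pick k nodes by Q-value, updating the seed-set summary
    (here simply the mean of the chosen embeddings) after each pick."""
    seeds = []
    state = torch.zeros(node_emb.size(1))
    mask = torch.zeros(node_emb.size(0), dtype=torch.bool)
    for _ in range(k):
        q = q_net(node_emb, state)
        q[mask] = float("-inf")          # exclude already-chosen seeds
        best = int(q.argmax())
        seeds.append(best)
        mask[best] = True
        state = node_emb[mask].mean(dim=0)
    return seeds

if __name__ == "__main__":
    emb = torch.randn(100, 32)           # stand-in for CTGNN embeddings at time t
    print(select_seeds(QNet(32), emb, k=5))
```

Because selection conditions each pick on a summary of the seeds chosen so far, the agent can learn to avoid redundant seeds whose influence regions overlap, which a per-node scoring heuristic cannot do.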