Abstract
Hierarchical Reinforcement Learning (HRL) handles long-horizon, sparse-reward tasks by decomposing complex decision processes, but its real-world application remains limited by instability between levels, inefficient subgoal scheduling, delayed responses, and poor interpretability. To address these challenges, we propose Timed and Bionic Circuit Hierarchical Reinforcement Learning (TBC-HRL), a biologically inspired framework that integrates two mechanisms. First, a timed subgoal scheduling strategy assigns each subgoal a fixed execution duration τ, mimicking the rhythmic action patterns of animal behavior to improve inter-level coordination and maintain goal consistency. Second, a Neuro-Dynamic Bionic Circuit Network (NDBCNet), inspired by the neural circuitry of C. elegans, replaces the conventional fully connected network in the low-level controller. With sparse connectivity, continuous-time dynamics, and adaptive responses, NDBCNet models temporal dependencies more effectively while offering better interpretability and lower computational overhead, making it well suited to resource-constrained platforms. Experiments on six dynamic, complex simulated tasks show that TBC-HRL consistently improves policy stability, action precision, and adaptability over traditional HRL, demonstrating the practical value and future potential of biologically inspired structures in intelligent control systems.
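The timed subgoal scheduling idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the environment and policy interfaces (`env`, `high_policy`, `low_policy`) are hypothetical stand-ins, and only the fixed-τ re-planning rhythm reflects the mechanism named in the abstract.

```python
def run_episode(env, high_policy, low_policy, tau, max_steps=200):
    """Roll out one episode in which the high-level policy commits to a
    subgoal for a fixed duration of tau steps (timed subgoal scheduling)."""
    obs = env.reset()
    subgoal = None
    total_reward = 0.0
    for t in range(max_steps):
        if t % tau == 0:
            # Re-plan only on the fixed rhythm: the subgoal stays constant
            # for tau consecutive low-level steps, keeping the goal consistent.
            subgoal = high_policy(obs)
        action = low_policy(obs, subgoal)   # low level acts toward the fixed subgoal
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

Between re-planning points the low-level controller sees an unchanging subgoal, which is what stabilizes inter-level coordination compared with re-issuing a subgoal at every step.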