Abstract
The widespread deployment of quadrotors in missions through complex environments has exposed a critical challenge: trajectory tracking accuracy degrades under time-varying wind disturbances. Conventional model-based controllers struggle to adapt to nonlinear wind field dynamics, while data-driven approaches often suffer from catastrophic forgetting, which compromises their adaptability to changing environments. This paper proposes a continual reinforcement learning framework that integrates continual backpropagation with reinforcement learning to achieve robust tracking for quadrotors operating in dynamic wind fields. A foundation model is first trained in wind-free conditions. As the wind disturbance intensity gradually changes, a neuron utility assessment mechanism dynamically resets low-utility neurons to maintain network plasticity. In addition, a multi-objective reward function is designed to improve both the precision and the efficiency of training. The framework is validated on the Gazebo/PX4 simulation platform under both stepwise-increasing and stochastically varying wind disturbances. In these tests, it achieves a lower root mean square error in trajectory tracking than the standard Proximal Policy Optimization (PPO) algorithm. By resetting neurons in a structured manner, the proposed framework addresses the plasticity loss problem in deep reinforcement learning and significantly enhances the continual adaptation capabilities of quadrotors in dynamic wind fields.
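As a rough illustration of the neuron utility assessment and reset step, the following minimal sketch follows the general idea of continual backpropagation for a single hidden layer. It is not the implementation used in this work. The function names, the contribution-style utility (activation magnitude times summed outgoing weight magnitude), the decay rate, the maturity threshold, and the reinitialization range are all assumptions chosen for illustration.

```python
import random


def update_utility(utility, activations, w_out, decay=0.99):
    """Return a running average of each hidden unit's contribution.

    Contribution (assumed form): |activation| * sum of |outgoing weights|.
    """
    updated = []
    for i, u in enumerate(utility):
        contribution = abs(activations[i]) * sum(abs(w) for w in w_out[i])
        updated.append(decay * u + (1.0 - decay) * contribution)
    return updated


def reset_low_utility(w_in, w_out, utility, age, n_reset=1, maturity=100, rng=None):
    """Reinitialize, in place, the n_reset mature units with the lowest utility.

    w_in[i]  : incoming weights of hidden unit i
    w_out[i] : outgoing weights of hidden unit i
    Units younger than `maturity` steps are protected from resetting.
    Returns the indices of the units that were reset.
    """
    rng = rng or random.Random(0)
    mature = [i for i in range(len(utility)) if age[i] >= maturity]
    victims = sorted(mature, key=lambda i: utility[i])[:n_reset]
    for i in victims:
        # Fresh incoming weights restore the unit's ability to learn.
        w_in[i] = [rng.uniform(-0.1, 0.1) for _ in w_in[i]]
        # Zero outgoing weights so the reset does not disturb the current output.
        w_out[i] = [0.0 for _ in w_out[i]]
        utility[i] = 0.0
        age[i] = 0
    return victims
```

In a training loop, `update_utility` would be called after each forward pass and `reset_low_utility` at a fixed, low replacement rate, so that only a small fraction of the network is reinitialized at any time.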