Abstract
In this work, an enhanced variant of the Adam optimizer, termed BDS-Adam, is proposed to address two critical limitations of the original algorithm: biased gradient estimation and training instability during early optimization. To overcome these issues, a dual-path framework is adopted. In the first path, a nonlinear gradient mapping module adaptively reshapes raw gradients through a hyperbolic tangent transformation, enabling the optimizer to better capture local geometric structure. In the second path, a semi-adaptive gradient smoothing controller, driven by real-time gradient variance, suppresses abrupt parameter updates and stabilizes training dynamics. The two paths are integrated through a gradient fusion mechanism that combines the smoothed and transformed gradients prior to each parameter update. Moreover, an adaptive second-order moment correction is employed to mitigate the cold-start effect caused by inaccurate variance estimates in the early training phase; this adaptive bias-correction formulation further improves training stability. A convergence analysis under non-convex settings shows that the expected gradient norm remains bounded under standard assumptions, indicating improved robustness and long-term stability. Empirical evaluations on three benchmark datasets (CIFAR-10, MNIST, and a gastric pathology image dataset) show test accuracy improvements of 9.27%, 0.08%, and 3.00% over Adam, respectively. These results confirm that the proposed dual-mechanism optimizer enhances both convergence speed and generalization performance across diverse tasks.
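The abstract describes the two paths, the fusion step, and the cold-start correction only qualitatively. The following is a minimal NumPy sketch of how such an update step could be assembled; the hyperparameters `alpha`, `mu`, and `v_floor`, the variance-driven smoothing coefficient, and the decaying variance floor are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def bds_adam_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                  alpha=1.0, mu=0.5, v_floor=1e-2, eps=1e-8):
    """One hypothetical BDS-Adam-style update for a parameter array `theta`.

    The exact formulas below are assumptions chosen to match the abstract's
    qualitative description, not the paper's stated equations.
    """
    t = state["t"] + 1

    # Path 1: nonlinear gradient mapping via a hyperbolic tangent reshaping.
    g_mapped = np.tanh(alpha * grad) / alpha

    # Path 2: semi-adaptive smoothing driven by a running gradient variance;
    # higher variance pushes lam toward 0, i.e. heavier smoothing.
    var = beta2 * state["var"] + (1.0 - beta2) * (grad - state["g_prev"]) ** 2
    lam = 1.0 / (1.0 + np.sqrt(var))
    g_smooth = lam * grad + (1.0 - lam) * state["g_smooth"]

    # Gradient fusion: combine the two paths before the moment updates.
    g_fused = mu * g_smooth + (1.0 - mu) * g_mapped

    # Adam-style first/second moments computed on the fused gradient.
    m = beta1 * state["m"] + (1.0 - beta1) * g_fused
    v = beta2 * state["v"] + (1.0 - beta2) * g_fused ** 2
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # Adaptive second-moment correction (assumption): a lower bound on the
    # variance estimate that decays with t, so early, noisy estimates cannot
    # inflate the step size (cold-start mitigation).
    v_hat = np.maximum(v_hat, v_floor / t)

    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)

    state.update(t=t, m=m, v=v, var=var, g_prev=grad, g_smooth=g_smooth)
    return theta, state

# Usage: initial state with scalar zeros, which broadcast against `theta`.
state = dict(t=0, m=0.0, v=0.0, var=0.0, g_prev=0.0, g_smooth=0.0)
theta = np.zeros(4)
theta, state = bds_adam_step(theta, grad=np.array([0.3, -1.2, 0.05, 2.0]), state=state)
```

In this reading, the tanh path bounds the influence of outlier gradient components while the variance-gated smoothing path damps abrupt updates, and the fusion weight `mu` trades the two off; the decaying floor on the corrected second moment plays the role of the cold-start correction described above.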