Abstract
Hawthorn is a well-known economic crop widely recognized for its efficacy in cardiovascular protection and blood pressure reduction. However, accurately identifying hawthorn varieties, which arise from diverse cultivation conditions, poses a significant challenge in species authentication. To address this challenge, we introduce a visual feature-based method for hawthorn identification. Specifically, we propose a multi-scale hybrid deep learning model that captures and merges both local and global features of hawthorn images. The model incorporates shallow prior information and high-level semantic information, thereby enhancing classification precision. Furthermore, to improve the model's ability to recognize local details in fine-grained images, we propose a novel spatial local attention mechanism. The loss functions are designed to suppress low-frequency features in fine-grained images. Extensive experiments on our hawthorn dataset and on two public datasets demonstrate that our model outperforms state-of-the-art methods.