Abstract
Tea (Camellia sinensis L.) disease detection in complex field conditions faces significant challenges due to the scarcity of labeled data, while current mainstream visual deep learning algorithms depend on large-scale curated datasets. To address this, we propose a novel few-shot end-to-end detection network called MAF-MixNet that achieves robust detection with minimal annotation data. Through the design of a mixed attention branch (MA-Branch) and a multi-path feature fusion module (MAFM), the network overcomes the insufficient feature extraction that limits existing methods when samples are scarce. The former extracts contextual features, while the latter combines and enhances local and global features. The model follows a two-stage paradigm: it is pretrained on public datasets and then fine-tuned on balanced subset datasets that include the novel tea disease classes anthracnose and brown blight. Comparative experiments with six models on four evaluation metrics demonstrate the advantage of our model. At 5-shot, MAF-MixNet achieves 62.0%, 60.1%, and 65.9% in precision, nAP50, and F1 score, respectively, significantly outperforming the other models. Similar superiority is achieved in the 10-shot scenario, where nAP50 reaches 73.8%. Our model also remains computationally efficient, achieving the second-fastest inference speed at 11.63 FPS, which makes it viable for real-world deployment. The results confirm MAF-MixNet's potential to enable cost-effective, intelligent disease monitoring in precision agriculture.