Abstract
Accurate identification of the geographical origin of tea leaves is crucial for quality assurance and traceability within the tea industry. This study introduces Origin-Tea, a novel lightweight convolutional neural network that combines depthwise separable convolutions with squeeze-and-excitation (SE) attention mechanisms to capture subtle phenotypic variations while minimizing computational cost. Unlike prior approaches that depend on heavy architectures or handcrafted features, Origin-Tea is explicitly designed for efficiency and interpretability in agricultural applications. Comprehensive ablation studies confirm that each architectural component contributes significantly to the model's performance. The dataset comprises 900 high-resolution RGB images of Yunkang 10 tea leaves, independently collected from seven distinct regions in Yunnan Province. A 10-fold stratified nested cross-validation (CV) was employed, with one fold designated for testing, one for validation, and the remaining eight for training in each iteration. Data augmentation techniques, including flipping, rotation, and exposure adjustment, were applied solely to the training set to enhance model robustness without compromising the intrinsic phenotypic features. Origin-Tea achieved an average overall accuracy (OA) of 0.92 ± 0.03 and a Kappa coefficient of 0.90 ± 0.03, outperforming the best-performing baseline, CoAtNet (OA = 0.89 ± 0.03), with a relative accuracy gain of 3.37% while reducing the parameter count by over 90% (1.7 M versus 17 M). Furthermore, in an independent test on 1788 scanner-captured images from four villages, Origin-Tea generalized well, reaching an OA of 0.97. These results highlight the model's potential as a scalable, field-deployable solution for intelligent tea provenance verification and precision phenotyping.
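The fold rotation described above (one fold for testing, one for validation, eight for training) can be illustrated with a minimal sketch. It is not the study's implementation: the helper names `stratified_folds` and `rotate_splits`, the choice of the fold after the test fold as the validation fold, the random seed, and the placeholder labels are all assumptions made for illustration, since the abstract gives neither the per-region image counts nor the exact nesting scheme.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each class is spread evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal samples round-robin, continuing where the previous class stopped,
        # so that overall fold sizes stay balanced as well.
        for j, idx in enumerate(members):
            folds[(offset + j) % k].append(idx)
        offset += len(members)
    return folds


def rotate_splits(folds):
    """Yield (train, val, test) index lists, one triple per iteration.

    Assumption: fold i is the test fold and fold i + 1 (wrapping around)
    is the validation fold; the remaining folds form the training set.
    """
    k = len(folds)
    for i in range(k):
        v = (i + 1) % k
        train = [idx for j, fold in enumerate(folds) if j not in (i, v) for idx in fold]
        yield train, folds[v], folds[i]


# Placeholder labels: 900 images over 7 regions (real per-region counts are unknown).
labels = [i % 7 for i in range(900)]
folds = stratified_folds(labels)
splits = list(rotate_splits(folds))
# Augmentation (flips, rotations, exposure changes) would be applied
# only to the images indexed by `train` in each iteration.
```

Keeping augmentation inside the training indices of each iteration prevents augmented copies of a validation or test image from leaking into training.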