Abstract
Identifying the geographical origin of wolfberry is key to ensuring its medicinal and edible quality. To identify the origin accurately, this study proposes the Stacking-OLDA algorithm, which combines Orthogonal Linear Discriminant Analysis (OLDA) with the Stacking ensemble learning framework. Savitzky-Golay (SG) filtering combined with Multiplicative Scatter Correction (MSC) was selected as the optimal preprocessing method. Four classifiers (K-Nearest Neighbors (KNN), Decision Tree, Support Vector Machine (SVM), and Naive Bayes) were used to explore 12 stacked combinations on 400 samples from five regions in Gansu: Zhangye, Yumen, Wuwei, Baiyin, and Dunhuang. When Principal Component Analysis (PCA), PCA + Linear Discriminant Analysis (LDA), and OLDA were compared as feature extraction methods, Stacking-OLDA achieved the highest average identification accuracy, 99%. The stacked combinations were generally more accurate than the single-classifier models. The study also assessed the role of each classifier within the different combinations and found that Stacking-OLDA with KNN as the meta-classifier achieved the highest accuracy. The experimental results demonstrate that Stacking-OLDA has excellent classification performance, providing an effective approach for accurately classifying wolfberry origins and offering an innovative solution for quality control in the food industry.
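The general shape of a stacked classifier on top of a discriminant feature extractor can be illustrated with a minimal sketch. It is not the implementation used in this study, and it rests on several assumptions: scikit-learn has no OLDA, so `LinearDiscriminantAnalysis` is swapped in as a stand-in; the data are synthetic (400 samples, 5 classes) rather than wolfberry spectra; SG + MSC preprocessing is omitted; and the choice of Decision Tree, SVM, and Naive Bayes as base learners with KNN as the meta-classifier is only one hypothetical combination.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for preprocessed spectra (assumption: not the real data).
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=10, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners; their cross-validated outputs become meta-features.
base_learners = [
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC()),
    ("nb", GaussianNB()),
]

# KNN as the meta-classifier, trained on the base learners' outputs.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=KNeighborsClassifier(n_neighbors=5),
    cv=5,
)

# LDA stands in for OLDA (assumption): project to at most n_classes - 1 = 4
# discriminant directions, then classify with the stacked ensemble.
model = make_pipeline(LinearDiscriminantAnalysis(n_components=4), stack)
model.fit(X_train, y_train)

pred = model.predict(X_test)
acc = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {acc:.3f}")
```

The accuracy printed here describes only the synthetic data and says nothing about the results reported above. Swapping the feature extractor or the meta-classifier means changing a single pipeline step, which is how different stacked combinations could be compared under the same train/test split.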