Abstract
This work explored effective methods for determining the age of sauce-flavor Baijiu using multivariate data analysis and machine learning. Because flavor changes during Baijiu storage are complex and dynamic, four analytical techniques, namely gas chromatography-mass spectrometry (GC-MS), gas chromatography-ion mobility spectrometry (GC-IMS), electronic nose (E-nose), and electronic tongue (E-tongue), were integrated to build a multilayered flavor profile of Baijiu. Four types of classification models were then constructed. A data fusion strategy combined with the synthetic minority over-sampling technique (SMOTE) and a neural network significantly enhanced the accuracy (0.96) and precision (0.97) of age determination for Baijiu aged from 1 to 30 years. A total of 28 important features were identified, including furfural and 2-hexanol (GC-MS), Area 65 (GC-IMS), and bitterness (E-tongue). Potential correlations among the data sources were also discussed: astringency (E-tongue) was positively correlated with ethyl lactate (GC-MS) and Area 40 (GC-IMS).
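
To make the fusion and oversampling steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes low-level (feature concatenation) fusion and a simplified SMOTE that interpolates between same-class neighbours. The function names `fuse` and `smote`, the neighbour count `k`, and all numeric values are hypothetical and chosen only for illustration; the neural network classifier is omitted, and real data would normally be scaled per feature before fusion.

```python
import math
import random

def fuse(*blocks):
    """Low-level fusion: concatenate each sample's features across instruments."""
    n = len(blocks[0])
    assert all(len(b) == n for b in blocks), "each block needs one row per sample"
    return [[v for b in blocks for v in b[i]] for i in range(n)]

def smote(X, y, k=2, seed=0):
    """Simplified SMOTE: grow every smaller class to the largest class size
    by interpolating between a sample and one of its k nearest same-class
    neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < 2:
            continue  # a single sample has no neighbour to interpolate with
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: math.dist(base, r),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out

# Hypothetical toy data: two GC-MS features and one E-tongue feature per sample.
gcms = [[1.0, 0.2], [1.1, 0.3], [0.9, 0.25], [1.05, 0.22], [2.0, 0.9], [2.1, 1.0]]
etongue = [[0.5], [0.6], [0.55], [0.52], [1.5], [1.4]]
ages = [1, 1, 1, 1, 30, 30]  # storage age in years (class labels)

X = fuse(gcms, etongue)
X_bal, y_bal = smote(X, ages, k=1)
print(len(X), "samples before,", len(X_bal), "after oversampling")
```

In this toy case the 30-year class is grown from two to four samples, so a downstream classifier would see balanced classes. With real data, oversampling should be applied only to the training split so that synthetic samples do not leak into the evaluation.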