Machine learning and near-infrared fusion-driven quantitative characterization and detection of protein content in maize kernels



Abstract

This study aims to develop a rapid, non-destructive method for determining protein content in maize using near-infrared spectroscopy (NIRS). Surface irregularities and uneven protein distribution in whole kernels distort spectral measurements, so maize powder was used as the test material to improve the uniformity and stability of the spectral signals. A total of 90 maize powder samples were collected from major production regions across China, and a custom NIRS acquisition system was constructed. To optimize the spectral data, eight preprocessing methods, including Multiplicative Scatter Correction (MSC), Standard Normal Variate (SNV), First Derivative (1D), Savitzky-Golay smoothing (S-G), and their combinations, were systematically evaluated. Traditional machine learning models (Partial Least Squares Regression, PLSR; Support Vector Machine, SVM) and deep learning models (ResNet-18, Transformer) were then developed to predict protein content, and their performance was compared. The combined preprocessing strategy of First Derivative followed by Multiplicative Scatter Correction (1D + MSC) was the most effective. Among the models, PLSR gave the best predictive performance, and the traditional chemometric methods showed greater practical utility than the deep learning models. To further improve model efficiency, four feature wavelength selection methods were applied: Partial Least Squares Regression Coefficients (PLSRC), Competitive Adaptive Reweighted Sampling (CARS), Successive Projections Algorithm (SPA), and Uninformative Variable Elimination (UVE). The PLSR model combined with SPA yielded the best performance, achieving a validation set correlation coefficient (Rp) of 0.927, a root mean square error of prediction (RMSEP) of 0.301, and a residual predictive deviation (RPD) of 2.502, along with the fastest computational speed. This study provides a reliable technical solution and theoretical foundation for the rapid, non-destructive detection of protein content in maize, and it also validates the advantage of powdered samples in improving the accuracy of NIRS detection.
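
The preprocessing steps and evaluation metrics named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes spectra are stored as rows of a 2-D array, uses a plain finite-difference derivative as a stand-in for whatever derivative filter the study applied, takes the mean spectrum as the MSC reference, and computes RPD as the sample standard deviation of the reference values divided by RMSEP (a common convention, assumed here). All function names and the synthetic data are hypothetical.

```python
import numpy as np


def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum.

    Each spectrum is regressed on the reference (default: the mean spectrum),
    then the fitted intercept is subtracted and the result divided by the slope.
    """
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


def first_derivative(spectra):
    """Finite-difference first derivative along the wavelength axis (assumed stand-in)."""
    return np.gradient(spectra, axis=1)


def prediction_metrics(y_true, y_pred):
    """Return (Rp, RMSEP, RPD) for reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r_p = np.corrcoef(y_true, y_pred)[0, 1]
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rpd = np.std(y_true, ddof=1) / rmsep  # assumed convention: SD / RMSEP
    return r_p, rmsep, rpd


# Synthetic demonstration data (not from the study): one band shape with
# per-sample multiplicative and additive scatter effects.
axis = np.linspace(0.0, 1.0, 200)
base = np.exp(-((axis - 0.4) / 0.15) ** 2)
gains = np.array([0.8, 1.0, 1.3])[:, None]
offsets = np.array([0.05, 0.0, -0.02])[:, None]
raw = gains * base + offsets

# "1D + MSC": derivative first, then scatter correction.
pretreated = msc(first_derivative(raw))
snv_spectra = snv(raw)

# Made-up reference and predicted protein values, for the metric functions only.
r_p, rmsep, rpd = prediction_metrics([9.0, 10.0, 11.0, 12.0],
                                     [9.2, 9.9, 11.1, 11.8])
print(f"Rp={r_p:.3f} RMSEP={rmsep:.3f} RPD={rpd:.3f}")
```

In the synthetic case the three spectra differ only by a gain and an offset, so the derivative removes the offset and MSC removes the gain, leaving the pretreated rows identical. Real spectra also carry chemical variation, which is what a regression model such as PLSR would then fit.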
