192 Prediction of fecal starch content of fattening cattle using near-infrared spectroscopy and machine learning

Abstract

Starch content in cattle feces can be used to predict starch digestibility. Near-infrared spectroscopy (NIRS) can predict fecal starch content rapidly and without wet chemistry once a calibration model has been developed. Recent advances in machine learning have introduced powerful methods for developing optimized models. In addition, diverse preprocessing methods, such as scatter correction and smoothing, are available to remove irrelevant sources of variance. This study aimed to identify the optimal spectral preprocessing and regression methods for predicting fecal starch content from NIR spectra. NIR spectra were measured for 196 fecal samples collected from 9 fattening cattle farms. Fecal starch content on a dry matter (DM) basis was measured using enzymatic reactions and colorimetry. The 196 samples were divided into a calibration set (n = 123) and a validation set (n = 29) drawn from four farms, and an external validation set (n = 44) drawn from five farms not represented in the calibration set. Based on the calibration set, preprocessing was selected in two ways (methods 1 and 2) using a partial least squares (PLS) model. In method 1, calibration models were developed by grid search with 5-fold cross-validation on each of 1,152 preprocessed versions of the calibration set, and the preprocessing with the lowest root mean square error (RMSE) of cross-validation was selected. In method 2, the original calibration set (n = 123) was divided into a calibration set (n = 116) and a preprocessing selection set (n = 7) whose samples had different neutral detergent fiber contents and similar starch contents. After preprocessing, calibration models were developed as described in method 1. The developed models were then used to predict the preprocessing selection set, and the preprocessing with the lowest RMSE on that set was selected.
Using the preprocessing selected by methods 1 and 2, the validation and external validation sets were predicted with seven regression models. The preprocessing selected by method 1 used the 400–2500 nm wavelength range, robust normal variate (RNV) scatter correction, and first-derivative Savitzky-Golay filtering (2nd-order polynomial, 2-nm window size). The preprocessing selected by method 2 used the 1500–2500 nm wavelength range, RNV scatter correction, and second-derivative Savitzky-Golay filtering (3rd-order polynomial, 25-nm window size). Among the seven regression models, the Lasso model gave the most accurate predictions. The Lasso models based on the preprocessing selected by methods 1 and 2 had RMSEs of external validation of 2.054% and 1.540% of DM, respectively. The feature importance of the Lasso model based on the preprocessing selected by method 2 was higher at wavelengths of 1775 nm and 2325 nm. In conclusion, seven models were developed and evaluated with diverse preprocessing methods using NIRS, and the Lasso model exhibited the highest accuracy.
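
The pipeline selected by method 2 (RNV scatter correction, second-derivative Savitzky-Golay filtering, then Lasso regression) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectra are synthetic, the absorption band at 2000 nm is arbitrary, the `rnv` helper uses one common percentile-based formulation of RNV, the 25-nm window is treated as 25 points on an assumed 1-nm grid, and the Lasso penalty is tuned with scikit-learn's `LassoCV`, which the abstract does not specify.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold


def rnv(spectra, q=25):
    """Robust normal variate (one common formulation, assumed here).

    Each spectrum has its q-th percentile subtracted and is divided by the
    standard deviation of the points at or below that percentile.
    """
    spectra = np.asarray(spectra, dtype=float)
    out = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        p = np.percentile(row, q)
        out[i] = (row - p) / np.std(row[row <= p])
    return out


# Synthetic stand-in data (NOT the study's spectra).
rng = np.random.default_rng(0)
wl = np.arange(1500, 2501)                       # nm, 1-nm grid (assumed)
n = 160
y = rng.uniform(0.0, 20.0, n)                    # synthetic starch, % of DM
band = np.exp(-0.5 * ((wl - 2000) / 30.0) ** 2)  # arbitrary absorption band
baseline = 1.0 + 0.0005 * (wl - 1500)            # shared sloped baseline
gain = rng.uniform(0.8, 1.2, (n, 1))             # multiplicative scatter
shift = rng.uniform(-0.1, 0.1, (n, 1))           # additive offset
X = gain * (baseline + 0.02 * y[:, None] * band) + shift
X += rng.normal(0.0, 0.001, X.shape)             # measurement noise

# RNV, then 2nd-derivative Savitzky-Golay with a 3rd-order polynomial.
# window_length is in points; 25 points equals 25 nm only on a 1-nm grid.
X_pre = savgol_filter(rnv(X), window_length=25, polyorder=3, deriv=2, axis=1)

# First 116 samples calibrate the model; the rest act as a held-out set.
X_cal, y_cal, X_val, y_val = X_pre[:116], y[:116], X_pre[116:], y[116:]

# Lasso with the penalty chosen by 5-fold cross-validation (an assumption).
model = LassoCV(cv=KFold(n_splits=5, shuffle=True, random_state=0),
                max_iter=10000)
model.fit(X_cal, y_cal)

rmse_holdout = float(np.sqrt(np.mean((model.predict(X_val) - y_val) ** 2)))
top = wl[np.argsort(np.abs(model.coef_))[::-1][:5]]  # largest |coefficients|
print(f"held-out RMSE: {rmse_holdout:.3f} % of DM; top wavelengths (nm): {top}")
```

Here the absolute Lasso coefficients serve as a simple proxy for feature importance; the abstract does not say how importance was computed, so that choice is also an assumption. With real spectra, the held-out rows would come from different farms rather than from a simple index split.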
