Estimation of gross calorific value of coal based on the cubist regression model

Abstract

The gross calorific value (GCV) of coal is an important parameter for evaluating coal quality, and regression analysis can be used to predict it. In this study, we propose a GCV prediction model based on cubist regression. To build a sound regression model, the input variables were selected using correlation analysis and a recursive feature elimination algorithm. This yielded three sets of variables as the optimal input combinations for the regression models: proximate analysis variables (Set 1: moisture, standard ash, and volatile matter), elemental analysis variables (Set 2: carbon, sulfur, and oxygen), and comprehensive index variables (Set 3: carbon, volatile matter, standard ash, sulfur, moisture, and hydrogen). The cubist model was compared with multiple linear regression, random forest regression, and several previously reported prediction models, including gradient boosting regression tree, support vector regression (SVR), backpropagation neural networks, and particle swarm optimization-artificial neural network (PSO-ANN). Among the three sets of input variables, all seven regression models fit best on the comprehensive index variables. The cubist model showed higher prediction accuracy and lower error than most other models: its R², mean absolute error, root mean square error, and average absolute relative deviation percentage were 0.990, 0.476, 0.668, and 0.086% for the proximate analysis variables; 0.992, 0.381, 0.596, and 0.140% for the elemental analysis variables; and 0.999, 0.161, 0.219, and 0.087% for the comprehensive index variables, respectively. The cubist model combines the advantages of decision trees and linear regression, which gives it good accuracy and also makes it highly interpretable, because its predictions are built from multiple linear sub-equations.
In addition, the cubist model runs noticeably faster than the alternatives, especially SVR and PSO-ANN, which require complex parameter optimization. In summary, the cubist model balances prediction accuracy, model interpretability, and computational efficiency, and it provides a new and effective method for GCV prediction.
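
The four evaluation metrics quoted in the abstract can be computed as in the minimal sketch below. It assumes the conventional definitions (R² as one minus the ratio of residual to total sum of squares, and the average absolute relative deviation as the mean of |error / measured value| expressed in percent); the abstract does not spell out its formulas, so these definitions are an assumption. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and AARD% for paired measured/predicted values.

    Assumes the conventional textbook definitions; the study itself may
    define the metrics slightly differently.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        # Relative deviation is taken against the measured value, in percent.
        "aard_percent": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }


if __name__ == "__main__":
    # Made-up GCV values purely for illustration.
    measured = [20.0, 25.0, 30.0]
    predicted = [21.0, 24.0, 30.0]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```

Because AARD% divides each error by the measured value, it is only meaningful when no measured GCV is zero, which holds for any real coal sample.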
