Abstract
In this study, we applied several machine learning techniques, namely XGBoost, Extra Trees, CatBoost, and Multiple Linear Regression (MLR), to model the heating values of municipal solid waste. The model inputs were the weight of the dry sample and the contents of carbon (C), hydrogen (H), oxygen (O), nitrogen (N), sulfur (S), and ash, all in kg. The Extra Trees model with tuned hyperparameters performed best, achieving R² values of 0.999 on the training set and 0.979 on the testing set. On the testing set, its Mean Squared Error (MSE) was 77,455.92, and its Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) were 245.886 and 16.22%, respectively, indicating good predictive accuracy and reliability. Although XGBoost and CatBoost also showed strong predictive capability with high R² values, Extra Trees outperformed them with significantly lower error metrics. In contrast, MLR, used as a conventional technique, showed moderate performance, suggesting a trade-off between explanatory power and predictive accuracy. In the feature importance analysis of the best model, Extra Trees, nitrogen content emerged as the most influential factor, followed by sulfur content, ash content, and dry sample weight, in descending order of importance.
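For readers less familiar with the four evaluation metrics reported above, the following is a minimal sketch of how the coefficient of determination (R²), MSE, MAE, and MAPE are conventionally computed from observed and predicted values. The function name `regression_metrics` and the sample numbers are hypothetical and chosen only for illustration; they are not data or code from this study, and the study's own evaluation code may differ.

```python
from math import fsum


def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, MAE and MAPE (in percent) for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = fsum(e * e for e in errors)       # residual sum of squares
    mse = ss_res / n                           # mean squared error
    mae = fsum(abs(e) for e in errors) / n     # mean absolute error
    # MAPE is undefined if any observed value is zero (division by zero).
    mape = 100.0 * fsum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = fsum(y_true) / n
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination
    return {"r2": r2, "mse": mse, "mae": mae, "mape": mape}


# Hypothetical heating values, for illustration only (not data from the study).
observed = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Note that MSE is expressed in squared units of the target variable, so its square root is the quantity comparable with MAE: for the reported test MSE of 77,455.92, the root is approximately 278.3, which is on the same scale as the reported MAE of 245.886.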