Abstract
Multi-energy complementary power systems use diverse energy sources in a coordinated way and, in doing so, generate large-scale, distributed operational data. Using this data for accurate and efficient carbon emission prediction is challenging. To process the large-scale distributed operational data of such systems, identify the key influencing factors, and achieve high-precision carbon emission prediction, this study investigates a carbon emission prediction method for multi-energy complementary power systems based on a multiple linear regression model. The structure of the multi-energy complementary power system is analyzed and its carbon emission intensity is calculated. Based on this analysis, candidate factors influencing carbon emissions are preliminarily selected. A multiple linear regression model is then constructed with the selected factors as independent variables and carbon emissions as the dependent variable. A significance test on each independent variable identifies the key influencing factors and yields an optimized multiple linear regression model. The model is integrated into the MapReduce parallel framework to improve computational scalability, so that large-scale distributed power system data can be processed in parallel without sacrificing prediction efficiency. The results show that the selected factor variables are reasonable and that the constructed prediction model has a high goodness-of-fit. The prediction error ranges from 0.00516% to 0.00818%, indicating high accuracy, while parallel processing keeps prediction efficient. The predictions indicate that the carbon emissions of the experimental multi-energy complementary energy center increase year by year from 2025 to 2031 and then decline gradually from 2031 to 2034. These findings provide a scientific basis for formulating carbon emission reduction policies for multi-energy complementary power systems.
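As an illustration of how a multiple linear regression can be fitted in a map/reduce style, the following minimal Python sketch computes per-partition sufficient statistics (the map step), sums them (the reduce step), and then solves the normal equations and derives a t-statistic for each coefficient. It is not the implementation used in this study: the function names (`map_partition`, `combine`, `invert`, `fit`) and the data layout are assumptions made for the example, and Python's built-in `map` and `functools.reduce` stand in for a real MapReduce cluster.

```python
from functools import reduce
from math import sqrt

def map_partition(rows):
    """Map step: sufficient statistics (X'X, X'y, y'y, n) for one data partition.

    rows is a list of (features, y) pairs; this layout is an assumption.
    """
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    yty = 0.0
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
        yty += y * y
    return xtx, xty, yty, len(rows)

def combine(a, b):
    """Reduce step: element-wise sum of two partitions' statistics."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty, a[2] + b[2], a[3] + b[3]

def invert(m):
    """Gauss-Jordan matrix inversion with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; factors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit(partitions):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, t-statistics)."""
    xtx, xty, yty, n = reduce(combine, map(map_partition, partitions))
    p = len(xty)
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    # Residual sum of squares via the identity RSS = y'y - beta'X'y
    rss = max(yty - sum(b * v for b, v in zip(beta, xty)), 0.0)
    sigma2 = rss / (n - p)  # requires more observations than coefficients
    t_stats = []
    for i, b in enumerate(beta):
        se = sqrt(sigma2 * inv[i][i])  # standard error of coefficient i
        t_stats.append(b / se if se > 0 else float("inf"))
    return beta, t_stats
```

Because the statistics are additive, the fitted coefficients do not depend on how the data is split across partitions. A factor would be retained as significant when the absolute value of its t-statistic exceeds the critical value of the t-distribution with n - p degrees of freedom at the chosen significance level; that comparison is left out of the sketch.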