Machine learning-based academic performance prediction with explainability for enhanced decision-making in educational institutions



Abstract

Education is crucial for developing effective life skills and for allocating needed resources. Higher education institutions are adopting advanced technologies, such as artificial intelligence (AI), to enhance traditional teaching methods. Predicting academic performance has become increasingly important because it can help improve university rankings and expand student opportunities. This study addresses challenges in performance analysis, quality education delivery, and student evaluation through machine learning (ML) models. Ten regression models are employed to predict academic outcomes, including K-Nearest Neighbors Regressor, Linear Regression, CatBoost, XGBoost, AdaBoost, and an ensemble voting regression (VR) algorithm that uses the top five heterogeneous regressors as base models. Two datasets with distinct feature sets and sizes were used to evaluate the generalizability of the models. The first dataset comprises 10,000 samples and six features focused on study behaviors, prior performance, and extracurricular activities. The second dataset includes 6,607 records and 20 features encompassing academic habits, demographic attributes, and institutional factors such as attendance, teacher quality, and parental involvement. Among the standalone ML models, linear regression performed best. The proposed ensemble VR model was then built by weighted averaging, with weights based on the performance of the base models. Local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP) are then used to explain the predictions of the proposed ensemble VR model. For the first dataset, the VR model achieved an RMSE of 0.1050, MAE of 0.0837, and R² of 0.9890. On the second, more complex dataset, the VR model also performed best, with an R² of 0.7716 using the full feature set, highlighting its robustness and adaptability across diverse academic contexts.
These results offer actionable insights for educators, administrators, and policymakers to better understand student performance drivers and support data-informed educational strategies.
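
The weighted voting step described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation. The abstract does not state the exact weighting scheme, so the sketch assumes weights proportional to each base model's validation R². It also uses synthetic data from `make_regression` in place of the study's datasets, and it swaps in scikit-learn's `GradientBoostingRegressor` and `RandomForestRegressor` for CatBoost and XGBoost to avoid extra dependencies. The helper names `build_weighted_voting_regressor` and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, VotingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def build_weighted_voting_regressor(X_train, y_train, X_val, y_val):
    """Fit five heterogeneous regressors and combine them by weighted averaging."""
    # Assumed base models; the study's actual top five are not listed in the abstract.
    base_models = [
        ("lr", LinearRegression()),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
        ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
        ("gbr", GradientBoostingRegressor(n_estimators=50, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]
    # Assumption: weight each model by its validation R^2, floored at a small
    # positive value so that a poorly performing model cannot get a negative weight.
    scores = []
    for _, model in base_models:
        model.fit(X_train, y_train)
        scores.append(max(r2_score(y_val, model.predict(X_val)), 1e-6))
    weights = np.array(scores) / np.sum(scores)

    # VotingRegressor clones and refits the base models, then averages their
    # predictions using the supplied weights.
    ensemble = VotingRegressor(estimators=base_models, weights=weights.tolist())
    ensemble.fit(X_train, y_train)
    return ensemble, weights


def evaluate(model, X_test, y_test):
    """Return the three metrics reported in the abstract: RMSE, MAE, and R^2."""
    pred = model.predict(X_test)
    return {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }


# Synthetic stand-in data with six features (not the study's data).
X, y = make_regression(n_samples=600, n_features=6, noise=5.0, random_state=0)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

ensemble, weights = build_weighted_voting_regressor(X_train, y_train, X_val, y_val)
metrics = evaluate(ensemble, X_test, y_test)
print("weights:", np.round(weights, 3))
print("metrics:", metrics)
```

Computing the weights on a held-out validation split, rather than on the test split, keeps the reported test metrics free of selection bias. The numbers this sketch prints describe the synthetic data only and are not comparable to the results reported above.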
