Abstract
BACKGROUND: Considering the potential importance of health care workers (HCWs) in maintaining and improving the health of society, we investigated the factors affecting their health-related quality of life (HRQoL) using machine learning. METHODS: This cross-sectional, population-based study used baseline data from the Shiraz University of Medical Sciences' Employees' Cohort (SUMSEC), comprising 7073 individuals aged 20 to 70 years. To identify determinants of HRQoL more accurately, we applied multiple linear regression alongside several machine learning algorithms: conditional tree, conditional forest, and random forest. The fit of these methods was then compared using mean squared error (MSE), root mean squared error (RMSE), R², and the Pearson correlation coefficient. RESULTS: On the test dataset, multiple linear regression and conditional forest performed similarly and produced more reliable predictions, with higher correlations and R² values and lower MSE and RMSE, than the random forest and decision tree methods. The most important factors affecting the HRQoL of HCWs were sleep quality, underlying diseases, sex, and education. CONCLUSIONS: In our study, multiple linear regression and conditional forest performed equally well, which suggests that the associations between the predictor variables and HRQoL were likely simple. In addition, demographic, clinical, and socioeconomic factors influence the HRQoL of HCWs. Recognizing and addressing these factors through targeted interventions and supportive policies can help improve the overall well-being and resilience of HCWs.
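For readers unfamiliar with the four fit criteria named in the methods, the following is a minimal sketch of how MSE, RMSE, R², and the Pearson correlation coefficient can be computed from observed and predicted scores on a held-out test set. The function name `fit_metrics` and the toy values are hypothetical illustrations, not the study's code or data, and R² is assumed here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares).

```python
import math


def fit_metrics(y_true, y_pred):
    """Return MSE, RMSE, R^2, and Pearson r for paired observed/predicted values.

    Hypothetical helper for illustration; not taken from the study.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination (assumed definition)

    # Pearson correlation between observed and predicted values
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    pearson = cov / math.sqrt(ss_tot * ss_pred)

    return {"mse": mse, "rmse": rmse, "r2": r2, "pearson": pearson}


# Toy HRQoL-like scores, invented purely for illustration
y_true = [60.0, 70.0, 80.0, 90.0]
y_pred = [62.0, 68.0, 82.0, 88.0]
metrics = fit_metrics(y_true, y_pred)
print(metrics)
```

Under this sketch, a model with lower MSE and RMSE and higher R² and Pearson correlation on the test set would be judged the better fit, which is the sense in which the methods are compared above.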