Abstract
OBJECTIVE: This study aimed to determine spirometric reference values for the healthy Iranian adult population and to compare them with established reference equations, specifically the GLI-Caucasian and Iranian equations.

METHODS: Spirometric parameters of 998 participants were obtained in 2016, during the recruitment phase of the Shahedieh cohort of the Prospective Epidemiological Research Studies in Iran (PERSIAN). k-nearest neighbors (KNN) regression was used to derive reference values for the spirometric parameters FEV1, FVC, FEV1/FVC, and FEF25-75%, with height and age as features. The performance of KNN regression was compared with that of conventional models used in previous studies, namely multiple linear regression (MLR) and the Lambda-Mu-Sigma (LMS) model. The predicted values were also compared with those obtained from the GLI-Caucasian and Iranian equations. The validation criterion was the mean squared error (MSE) under 5-fold cross-validation.

RESULTS: The study included 473 female and 525 male participants. KNN regression predicted all four spirometric parameters more accurately than MLR and LMS. For example, the MSE for predicting FVC in female participants was 0.159 with KNN regression, 0.169 with MLR, and 0.165 with LMS. For all four parameters, the predictions of the present study were also closer to the observed values in the reference population than the predictions from the two sets of published reference equations. The MSE of predicted FVC for female participants was 0.159 in the present study, lower than with the Iranian (MSE = 0.344) and GLI-Caucasian (MSE = 0.397) equations.

CONCLUSION: Using a flexible machine learning approach, this study established spirometry reference values specifically for the Iranian adult population. Because spirometry reference values vary among populations, the Excel calculator developed in this research can be a valuable tool in healthcare centers for assessing lung function in Iranian adults.
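The validation scheme described in the methods (KNN regression on height and age, scored by MSE under 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the study's code. The number of neighbors, the Euclidean distance, the feature standardization, and every function name below are assumptions made for illustration, and the data are synthetic values generated from an arbitrary made-up relationship, not PERSIAN cohort measurements.

```python
import random


def standardize(train_X, X):
    """Scale X using the training mean and standard deviation of each feature.

    Assumption: scaling is applied so that height (cm) and age (years)
    contribute comparably to the distance. The abstract does not say
    whether the study did this.
    """
    n_features = len(train_X[0])
    means = [sum(r[j] for r in train_X) / len(train_X) for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in train_X) / len(train_X)
        stds.append(var ** 0.5 or 1.0)  # guard against a constant feature
    return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in X]


def knn_predict(train_X, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_validated_mse(X, y, k_neighbors, n_folds=5, seed=0):
    """Mean of the per-fold mean squared errors over n_folds held-out folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    fold_mses = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        # Fit the scaling on the training folds only, then apply it to both.
        train_Xs = standardize(train_X, train_X)
        test_Xs = standardize(train_X, [X[i] for i in fold])
        errors = [
            (knn_predict(train_Xs, train_y, q, k_neighbors) - y[i]) ** 2
            for q, i in zip(test_Xs, fold)
        ]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / n_folds


# Synthetic, illustrative data only: the coefficients below are arbitrary
# and are NOT the reference equations of this or any other study.
rng = random.Random(42)
heights = [rng.uniform(150, 190) for _ in range(200)]  # cm
ages = [rng.uniform(20, 70) for _ in range(200)]       # years
fvc = [0.05 * h - 0.02 * a - 4.0 + rng.gauss(0, 0.4) for h, a in zip(heights, ages)]
X = [[h, a] for h, a in zip(heights, ages)]

for k in (1, 5, 15):
    print(f"k={k}: 5-fold CV MSE = {cross_validated_mse(X, fvc, k):.3f}")
```

In a sketch like this, the same folds would be reused to score each competing model (for instance MLR or LMS) so that the MSE values are directly comparable. The results above are reported for female participants separately, so a separate fit per sex is a plausible arrangement, although that is an inference rather than something the abstract states.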