Abstract
OBJECTIVE: To use machine learning (ML) to address limitations in pulse oximetry accuracy associated with low oxygen saturation and melanin content, and to compare the results with standard pulse oximetry and gold-standard arterial blood gas readings.
BACKGROUND: Oxygen saturation is traditionally measured either by arterial blood gas analysis (SaO₂), the gold standard, or by pulse oximetry (SpO₂), which approximates SaO₂ from light absorption patterns. However, SpO₂ has been shown to overestimate oxygen saturation, particularly in individuals with darker skin tones, leading to hidden hypoxemia and delayed medical interventions.
METHOD: We developed ML models trained on the BOLD dataset, which integrates patient data from eICU, MIMIC-III, and MIMIC-IV (n = 49,093). Using 64 clinical features and 2 outcomes (SaO₂ and hidden hypoxemia events), we trained regression models (linear regression, random forest, and XGBoost) to predict SaO₂ by minimizing the mean squared error between predicted and ground-truth SaO₂. On the test data, we compared model performance with standard SpO₂ measurement using accuracy root mean square error (A_rms), R², and the change in hidden hypoxemia events. We used SHapley Additive exPlanations (SHAP) to rank the most important features for SaO₂ prediction.
RESULTS: The XGBoost-Vanilla model reduced A_rms to 3.3% from a baseline of 4.1%. In patients with low SaO₂ (SaO₂ < 90%), the accuracy of pulse oximetry was heavily compromised, with an A_rms of 10.3%; the linear regression model with weighted loss reduced A_rms to 8.6%. SpO₂, creatinine level, mean corpuscular hemoglobin level, respiratory rate, and age at admission were the leading features driving the SaO₂ prediction.
CONCLUSION: These findings suggest that ML-based models can improve on the accuracy of standard SpO₂ measurement. Further investigation is warranted to address SpO₂ inaccuracies and bias.
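The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy readings, and the 88% cutoffs used to flag hidden hypoxemia are assumptions made for illustration, since the abstract does not state the exact definition used.

```python
import math

def a_rms(reference, estimate):
    """Accuracy root mean square error: sqrt(mean((estimate - reference)^2))."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(reference, estimate):
    """Coefficient of determination of the estimate against the reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

def hidden_hypoxemia_count(sao2, estimate, sao2_cutoff=88.0, estimate_cutoff=88.0):
    """Count readings where true SaO2 is low but the estimate looks acceptable.

    ASSUMPTION: both cutoffs default to 88%; the abstract gives no thresholds.
    """
    return sum(
        1 for s, e in zip(sao2, estimate)
        if s < sao2_cutoff and e >= estimate_cutoff
    )

# Toy values (hypothetical, in percent saturation), not data from the study.
sao2 = [96.0, 92.0, 87.0, 85.0, 98.0]    # arterial blood gas reference
spo2 = [98.0, 95.0, 93.0, 90.0, 98.0]    # pulse oximeter readings
model = [96.5, 93.0, 89.0, 86.0, 97.5]   # hypothetical model predictions

for name, est in (("SpO2", spo2), ("model", model)):
    print(
        f"{name}: A_rms={a_rms(sao2, est):.2f}, "
        f"R2={r_squared(sao2, est):.2f}, "
        f"hidden hypoxemia={hidden_hypoxemia_count(sao2, est)}"
    )
```

Computing the same three quantities for raw SpO₂ and for the model predictions against the same SaO₂ reference is what makes the comparison in the abstract like-for-like.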
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-025-00511-3.