Abstract
Background: Insulin resistance (IR) is an underlying pathophysiology of type 2 diabetes (T2D). The Homeostasis Model Assessment of Insulin Resistance (HOMA-IR) is the simplest method for evaluating IR. Separately, volatile organic compounds (VOCs) detected in human breath can be correlated with specific diseases. To date, machine learning (Mach-L) has not been used to examine potential relationships between VOCs and IR. The present study has two aims: (1) to identify the VOCs most relevant to HOMA-IR, and (2) to use Shapley additive explanations (SHAP) to determine the distribution and direction of each feature's impact on HOMA-IR in Taiwanese women. Methods: A total of 1432 Taiwanese women aged 19 to 84 years were enrolled, and 344 VOCs were measured. Three Mach-L methods were applied, with traditional multiple linear regression (MLR) serving as the benchmark for comparison. SHAP was then used to evaluate the direction of each feature's impact on HOMA-IR. Results: Six VOCs were identified as important: dimethylfuran, propanamine, aniline, butoxyethanol, and isopropyltoluene (in order from most to least important), together with dodecane. SHAP showed that dimethylfuran, isopropyltoluene, and dodecane were positively associated with HOMA-IR, while butoxyethanol, aniline, and propanamine were negatively associated. Conclusions: Using three different Mach-L methods, six VOCs were found to be related to IR in Taiwanese women. In order of importance, dimethylfuran, propanamine, aniline, butoxyethanol, and isopropyltoluene could be used to help assess IR. Furthermore, SHAP indicated that dimethylfuran, isopropyltoluene, and dodecane had a positive influence on HOMA-IR, while the other three (butoxyethanol, aniline, and propanamine) had a negative influence.
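For readers unfamiliar with the index, HOMA-IR is conventionally computed from fasting glucose and fasting insulin. The abstract does not state the formula used, so the following is only a minimal sketch that assumes the widely used approximation (glucose in mmol/L multiplied by insulin in µU/mL, divided by 22.5). The function names and the mg/dL conversion helper are illustrative and do not come from the study.

```python
def mg_dl_to_mmol_l(glucose_mg_dl: float) -> float:
    """Convert glucose from mg/dL to mmol/L (approximate factor of 18)."""
    return glucose_mg_dl / 18.0


def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """Assumed standard HOMA-IR approximation:
    glucose (mmol/L) * insulin (uU/mL) / 22.5.
    """
    if fasting_glucose_mmol_l <= 0 or fasting_insulin_uU_ml <= 0:
        raise ValueError("glucose and insulin must be positive")
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


# Hypothetical values, for illustration only (not study data).
example = homa_ir(mg_dl_to_mmol_l(90.0), 9.0)
```

Higher values indicate greater insulin resistance. The threshold treated as abnormal varies between populations and is not specified in this abstract.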