Abstract
BACKGROUND: Hearing loss affects over 466 million individuals globally and is recognized as a major risk factor for Alzheimer's disease, yet treatment personalization remains limited because the underlying causes are complex and diverse. Current diagnostic and therapeutic approaches lack standardized methods for predicting the most appropriate intervention for an individual patient. Integrating medical ontologies with machine learning is a promising way to improve diagnostic accuracy and treatment personalization.
AIM: Our study aimed to (i) develop a Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT)-aligned clinical ontology for hearing loss that uses Semantic Web Rule Language (SWRL) for automated reasoning; (ii) implement a Random Forest classifier trained on ontology-enriched patient data to classify hearing loss types (conductive, sensorineural, mixed, or normal); and (iii) predict optimal personalized treatments based on laterality, severity, audiometric thresholds, and medical history using real-world patient data.
METHODS: We developed a task ontology in Web Ontology Language (OWL) using Protégé 5.6.3, aligned it with SNOMED CT terminology, and implemented SWRL rules executed by the Pellet 2.2.0 reasoner. The framework was trained and evaluated on 3723 adult patients with complete audiometric and clinical data from the 2015-2016 National Health and Nutrition Examination Survey (NHANES) dataset. Random Forest models were developed using a stratified 80/20 train-test split and five-fold cross-validation. Performance was compared between models trained on K-Means clustering-based labels and models trained on labels from ontology-based semantic inference, using accuracy, precision, recall, F1-score, and log loss.
RESULTS: The ontology generated semantic labels for all 3723 patients, covering hearing loss type, severity level, and laterality. The Random Forest model with K-Means clustering achieved a test accuracy of 90.2% with a log loss of 0.2766 and a cross-validation mean accuracy of 91.22% (standard deviation 1.2%). Ontology-based semantic enrichment improved performance, raising test accuracy to 92.48% and cross-validation mean accuracy to 92.80% (standard deviation 0.9%). F1-scores improved across all classes, notably for mixed hearing loss, which rose from 0.86 to 0.92. Feature importance analysis identified audiometric thresholds, ontology-derived severity labels, and medical history as the top predictors, which aids clinical interpretability.
CONCLUSIONS: This study demonstrates that combining a SNOMED CT-aligned ontology with Random Forest classification achieves higher classification accuracy than K-Means clustering-based labeling and enables personalized treatment recommendations for hearing loss. The hybrid framework provides clinically interpretable decision support while supporting semantic interoperability with electronic health records. Multi-institutional validation studies are necessary to assess generalizability across diverse populations before clinical deployment.
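The evaluation metrics named in METHODS and RESULTS (accuracy, precision, recall, F1-score, and log loss) can be illustrated with a minimal plain-Python sketch. It is not the study's code and uses made-up toy labels rather than NHANES data. The function names, the example values, and the choice to pass class probabilities as dictionaries are all assumptions made for illustration. In practice a library such as scikit-learn would likely be used instead.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """One-vs-rest precision, recall, and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def log_loss(y_true, y_proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_proba is a list of dicts mapping class label -> predicted probability
    (a hypothetical input format chosen for readability).
    """
    total = 0.0
    for t, probs in zip(y_true, y_proba):
        # Clip to avoid log(0) when the true class received zero probability.
        p = min(max(probs.get(t, 0.0), eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy example (hypothetical values, four patients).
y_true = ["normal", "normal", "mixed", "conductive"]
y_pred = ["normal", "mixed", "mixed", "conductive"]
y_proba = [
    {"normal": 0.5, "mixed": 0.3, "conductive": 0.1, "sensorineural": 0.1},
    {"normal": 0.4, "mixed": 0.5, "conductive": 0.05, "sensorineural": 0.05},
    {"normal": 0.2, "mixed": 0.6, "conductive": 0.1, "sensorineural": 0.1},
    {"normal": 0.1, "mixed": 0.05, "conductive": 0.8, "sensorineural": 0.05},
]

print("accuracy:", accuracy(y_true, y_pred))
print("mixed P/R/F1:", precision_recall_f1(y_true, y_pred, "mixed"))
print("log loss:", log_loss(y_true, y_proba))
```

Log loss penalizes confident wrong predictions more heavily than accuracy does, which is why the two can move independently when models are compared.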