AI-based disease category prediction model using symptoms from low-resource Ethiopian language: Afaan Oromo text

基于人工智能的疾病类别预测模型,使用资源匮乏的埃塞俄比亚语言的症状:Afaan Oromo 文本

阅读:1

Abstract

Automated disease diagnosis and prediction, powered by AI, play a crucial role in enabling medical professionals to deliver effective care to patients. While such predictive tools have been extensively explored in resource-rich languages like English, this manuscript focuses on predicting disease categories automatically from symptoms documented in the Afaan Oromo language, employing various classification algorithms. This study encompasses machine learning techniques such as support vector machines, random forests, logistic regression, and Naïve Bayes, as well as deep learning approaches including LSTM, GRU, and Bi-LSTM. Due to the unavailability of a standard corpus, we prepared three data sets with different numbers of patient symptoms arranged into 10 categories. The two feature representations, TF-IDF and word embedding, were employed. The performance of the proposed methodology has been evaluated using accuracy, recall, precision, and F1 score. The experimental results show that, among machine learning models, the SVM model using TF-IDF had the highest accuracy and F1 score of 94.7%, while the LSTM model using word2vec embedding showed an accuracy rate of 95.7% and F1 score of 96.0% from deep learning models. To enhance the optimal performance of each model, several hyper-parameter tuning settings were used. This study shows that the LSTM model verifies to be the best of all the other models over the entire dataset.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。