Abstract
Diabetes is a major global health challenge that requires early detection and accurate diagnosis to prevent serious complications. Traditional methods struggle to handle the complexity of modern data sets, and advanced deep learning techniques can yield better solutions. This paper proposes a novel deep learning framework optimized for diabetes prediction on the Pima Indian Diabetes Dataset. The framework introduces the CatBoost algorithm alongside a deep learning architecture built from Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory (Bi-LSTM) networks. Hyperparameters were tuned with the Mountain Gazelle Optimizer (MGO), which balances exploration and exploitation in the search space. The proposed model achieved the best performance, with an accuracy of 0.955, a precision of 0.96, a recall of 0.95, and an F1-score of 0.95, outperforming traditional algorithms such as Logistic Regression and Naive Bayes, which recorded accuracies of 0.775 and 0.78, respectively. The proposed approach also outperforms other deep learning methods, including CNNs and Bi-LSTMs, across multiple evaluation metrics, demonstrating its strength and potential in clinical diagnostics. Recursive Feature Elimination (RFE) enhances the interpretability of the method, making it a strong candidate for medical applications in which clarity in decision-making is crucial.
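For readers less familiar with the evaluation metrics quoted above, the following minimal Python sketch shows how accuracy, precision, recall, and F1-score are conventionally derived from confusion-matrix counts. It is an illustration, not the authors' evaluation code: the function names `classification_metrics` and `f1_from_precision_recall` are hypothetical, and the abstract does not report the underlying confusion matrix. The only figures taken from the abstract are the reported precision (0.96) and recall (0.95), which are used to check that the reported F1-score is consistent with them.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_from_precision_recall(0.96, 0.95)
print(f"{reported_f1:.3f}")  # about 0.955, in line with the reported 0.95
```

Because F1 is a harmonic mean, it always lies between precision and recall, so a value near 0.95 is what the reported precision and recall imply.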