Abstract
Diabetes remains a major public health challenge, contributing to complications such as kidney disease, cardiovascular disorders, and diabetic retinopathy. Early detection is essential for timely intervention, yet prediction from structured biomedical data is often hindered by small sample sizes and limited feature diversity. This study investigates a deep learning framework that combines tabular-to-image transformation, pre-trained Convolutional Neural Networks (CNNs), and Long Short-Term Memory (LSTM) networks to enhance diabetes prediction. Using the Pima Indians Diabetes Dataset, numerical features were transformed into 2D image representations based on correlation patterns and feature importance scores. Conditional Generative Adversarial Networks (cGANs) were used to generate additional synthetic samples for training. Features were extracted with DenseNet201, ResNet152, Xception, and EfficientNetB4 and then classified using LSTM networks optimised via Bayesian search. In five-fold cross-validation, the pipeline achieved 94% accuracy and 98% AUC on the augmented Pima dataset, exceeding commonly reported benchmarks; however, these results may partially reflect the influence of the synthetic data. When evaluated on the Frankfurt Diabetes Dataset, the model exhibited comparable performance, although the limited number of samples means that additional studies are required to establish its generalizability. The proposed framework thus shows promising performance for diabetes prediction from structured data. While the results suggest potential applicability to broader biomedical classification tasks, further validation on large, demographically diverse, multi-institutional datasets is essential before any clinical translation is considered.
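To make the tabular-to-image step more concrete, the following is a minimal sketch of one way such a transformation could work; it is not the mapping used in this study, whose details are given only in the main text. It assumes that features are min-max scaled, ordered by a supplied importance score, and combined pairwise with the absolute Pearson correlation as a fixed pixel weight. The function name `tabular_to_images`, the pixel formula, and the random demo data are all hypothetical.

```python
import numpy as np


def tabular_to_images(X, importance):
    """Map each row of X (n_samples, n_features) to an
    (n_features, n_features) image. Illustrative assumption only."""
    X = np.asarray(X, dtype=float)
    importance = np.asarray(importance, dtype=float)

    # Min-max scale each feature to [0, 1]; constant columns map to 0.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Xs = (X - lo) / span

    # Order features so the most important ones sit in the top-left corner.
    order = np.argsort(-importance, kind="stable")
    Xs = Xs[:, order]

    # Absolute Pearson correlation between features, used as a pixel weight.
    corr = np.abs(np.nan_to_num(np.corrcoef(Xs, rowvar=False)))

    # Assumed pixel rule: pixel (i, j) of a sample = x_i * x_j * |corr_ij|.
    images = Xs[:, :, None] * Xs[:, None, :] * corr[None, :, :]
    return images, order


# Demo on random data with 8 features (the Pima dataset also has 8 predictors).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 8))
importance_demo = np.linspace(1.0, 0.1, 8)  # hypothetical importance scores
images, order = tabular_to_images(X_demo, importance_demo)
print(images.shape)  # (20, 8, 8)
```

Under these assumptions each sample becomes a small symmetric single-channel image; feeding it to a pre-trained CNN would additionally require resizing and replicating it to three channels.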