Abstract
BACKGROUND AND AIMS: Each year, 4 million people are affected by ulcers in the GI tract, and poorly managed ulcers can lead to adverse events that may increase the risk of developing fatal diseases such as gastric cancer, GI bleeding, ulcerative colitis, and other conditions. Deep learning-based computer-aided diagnostics is essential for automating the diagnosis of such illnesses. It aids clinicians by shortening the time to diagnosis and lowering diagnostic errors. Data scarcity is expected in endoscopy settings, and small data sets prevent deep learning models from generalizing. The current study addresses data scarcity by expanding the available data with artificially generated data.
METHODS: This study combines the generative adversarial network (GAN) and the variational autoencoder (VAE) into a VAE-GAN architecture to generate artificial endoscopic images. VAE-GAN is resilient to mode collapse, vanishing gradients, instability, and non-convergence. The generated images were tested against a trained DenseNet121 model with a higher classification threshold for ulcer identification, yielding precision and recall scores that capture the quality and diversity of the generated images, respectively. The generated images were also presented to an expert and compared with the expert's clinical interpretation. The effectiveness of data augmentation with VAE-GAN was further evaluated with a purpose-built 5-layer convolutional neural network (CNN) model for ulcer classification.
RESULTS: The proposed VAE-GAN architecture synthesized realistic artificial endoscopic images. With the trained DenseNet121 model under a higher threshold for ulcer detection, the generated ulcer images achieved 99% precision and 92% recall. When asked to distinguish artificial from real ulcer images, the expert achieved 57.1% classification accuracy. The 5-layer CNN model achieved 94% classification accuracy on the test data set without data augmentation. With data augmentation, accuracy improved to 100%.
CONCLUSIONS: The proposed VAE-GAN architecture produced promising results in synthesizing artificial endoscopic images. The high precision and good recall showed that the artificial images are of high quality and diverse. The expert's assessment also indicated that the generated images are of good quality. Hence, the proposed VAE-GAN architecture is expected to help solve the problem of insufficient data for training deep learning models that generalize well. In addition, data augmentation using VAE-GAN expanded the data set and reduced data biases to some extent, as indicated by the score observed with the 5-layer CNN model.
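The precision and recall scores reported above depend on the classification threshold applied to the classifier's ulcer probability. The abstract does not state how the scores were computed, so the following is only a minimal sketch using the standard classification definitions. The function name `precision_recall_at_threshold`, the label convention (1 for ulcer, 0 otherwise), and the example scores are all hypothetical and are not taken from the study.

```python
def precision_recall_at_threshold(scores, labels, threshold=0.5):
    """Compute precision and recall for the positive (ulcer) class.

    scores: predicted ulcer probabilities in [0, 1] (assumed classifier output).
    labels: intended class of each image, 1 for ulcer and 0 otherwise.
    threshold: an image is predicted as ulcer when score >= threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_ulcer = score >= threshold
        if predicted_ulcer and label == 1:
            tp += 1  # ulcer image correctly flagged
        elif predicted_ulcer and label == 0:
            fp += 1  # non-ulcer image wrongly flagged
        elif not predicted_ulcer and label == 1:
            fn += 1  # ulcer image missed

    # Guard against division by zero when nothing is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative, made-up scores: four ulcer images and two non-ulcer images.
example_scores = [0.95, 0.85, 0.60, 0.97, 0.30, 0.70]
example_labels = [1, 1, 1, 1, 0, 0]

default_pr = precision_recall_at_threshold(example_scores, example_labels, 0.5)
strict_pr = precision_recall_at_threshold(example_scores, example_labels, 0.9)
print("threshold 0.5:", default_pr)
print("threshold 0.9:", strict_pr)
```

In this toy data, raising the threshold removes the false positive and so raises precision, at the cost of missing lower-scoring ulcer images and lowering recall. This is the usual trade-off behind evaluating under a stricter threshold.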