Abstract
Generative learning produces new, unseen data by learning patterns from existing data. Text-to-image (T2I) generation is a growing area of generative learning that focuses on creating realistic images from natural language descriptions. Although many models can generate images from text, few accept regional languages as input, which limits the ability to produce realistic images from regional-language descriptions. This paper proposes a semantic mapping approach for Hindi T2I generation using a Generative Adversarial Network (GAN). A Hindi-language dataset is prepared, pre-processed, and used to train the regional T2I generation model. The study uses the Caltech-UCSD Birds 200 (CUB) dataset as its primary source. Experiments indicate that the model performs well and produces robust images. Our model improves significantly on existing results, with an Inception Score of 4.65 and an FID score of 37.17, and reports a first-of-its-kind measure of semantic alignment between Hindi text and images, an R-precision score of 75.12.