Semantic mapping of Hindi text-to-image generation using CUB dataset


Abstract

Generative learning enables the creation of unseen data by learning patterns from existing data. Text-to-image generation (T2I) is a growing area of generative learning that focuses on creating realistic images from natural-language descriptions. Although many models can generate images from text, there is a significant gap when the input is a regional language, which restricts the ability to produce realistic visuals from regional textual descriptions. This paper proposes a semantic mapping approach for Hindi T2I generation using a Generative Adversarial Network (GAN). A regional T2I model is trained on region-specific data: a Hindi-language dataset is prepared, pre-processed, and fed to the model. The study uses the Caltech-UCSD Birds 200 (CUB) dataset as its primary source. The experiments indicate that the model performs well and produces robust images. Our model improves significantly on existing results, with an Inception Score of 4.65, an FID score of 37.17, and a first-of-its-kind evaluation of Hindi text-image semantic alignment, achieving an R-precision score of 75.12.
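The R-precision metric mentioned above measures semantic alignment by checking whether a generated image retrieves its own ground-truth caption from a pool of candidates. The sketch below illustrates the general idea with toy NumPy embeddings; the actual image and text encoders used in the paper are not specified here, so the embeddings (and the noise model relating them) are purely illustrative assumptions.

```python
import numpy as np

def r_precision(img_emb, txt_emb, r=1):
    """Fraction of images whose ground-truth caption (the caption at
    the same index) ranks within the top-r candidates by cosine
    similarity. img_emb and txt_emb have shape (n, d)."""
    # L2-normalise rows so that dot products equal cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    sims = img @ txt.T                   # (n_images, n_captions)
    hits = 0
    for i, row in enumerate(sims):
        top = np.argsort(row)[::-1][:r]  # indices of top-r captions
        hits += int(i in top)            # ground truth shares the index
    return hits / len(sims)

# Toy data (an assumption, not the paper's encoders): caption i is a
# noisy copy of image i's embedding, so the matching caption usually
# scores highest and R-precision is close to 1.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(50, 64))
caps = imgs + 0.1 * rng.normal(size=(50, 64))
print(r_precision(imgs, caps, r=1))
```

In practice the candidate pool for each image typically contains one matching caption and many mismatched ones, and the score is averaged over the test set; the 75.12 reported in the abstract is such a percentage.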
