DCLTV: An Improved Dual-Condition Diffusion Model for Laser-Visible Image Translation

Abstract

Laser active imaging systems can compensate for the shortcomings of visible-light imaging systems under difficult imaging conditions and thereby obtain clear images. However, laser images exhibit a significant modality gap relative to visible images, which impedes both human perception and computer processing. It is therefore necessary to translate laser images into visible images across modalities. Existing cross-modal image translation algorithms suffer from issues such as difficult training and color bleeding. In recent studies, diffusion models have demonstrated strong image generation and translation abilities and have been shown to produce high-quality images. To achieve more accurate laser-visible image translation, we designed an improved diffusion model, called DCLTV, which limits the randomness of diffusion models by means of dual-condition control. We incorporated the Brownian bridge strategy as the first condition control and employed interpolation-based conditional injection as the second condition control. We also established a dataset of 665 pairs of laser-visible images to compensate for the lack of data in the field of laser-visible image translation. Compared with five representative baseline models, namely Pix2pix, BigColor, CT2, ColorFormer, and DDColor, the proposed DCLTV achieved the best performance in both qualitative and quantitative comparisons, realizing at least a 15.89% reduction in FID and at least a 22.02% reduction in LPIPS. We further validated the effectiveness of the dual conditions in DCLTV through ablation experiments, which yielded the best results: an FID of 154.74 and an LPIPS of 0.379.
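
The abstract does not spell out how the Brownian bridge condition is formulated. As a rough illustration only, the sketch below follows the generic Brownian bridge diffusion formulation, in which the forward process is pinned at the target image x0 at step 0 and at the conditioning image y at step T. The linear schedule m_t = t/T, the variance 2(m_t - m_t^2), and the names `bridge_coefficients` and `bridge_sample` are assumptions made for this example and are not taken from DCLTV itself.

```python
import math
import random


def bridge_coefficients(t, T):
    """Return (m_t, delta_t) for an assumed linear Brownian bridge schedule."""
    m_t = t / T
    # Variance is zero at both endpoints and peaks at 0.5 when t = T / 2.
    delta_t = 2.0 * (m_t - m_t ** 2)
    return m_t, delta_t


def bridge_sample(x0, y, t, T, rng=None):
    """Sample x_t = (1 - m_t) * x0 + m_t * y + sqrt(delta_t) * noise.

    x0 and y are flat lists of pixel values standing in for the visible
    (target) image and the laser (condition) image. This is a hypothetical
    toy representation, not the paper's data format.
    """
    rng = rng or random.Random()
    m_t, delta_t = bridge_coefficients(t, T)
    sigma = math.sqrt(delta_t)
    return [
        (1.0 - m_t) * a + m_t * b + sigma * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, y)
    ]
```

Because the noise term vanishes at both ends, the process starts exactly at x0 and ends exactly at y instead of at pure Gaussian noise, which is one way a bridge formulation can limit the randomness of the translation.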
