INR Smooth: Interframe noise relation-based smooth video synthesis on diffusion models


Abstract

The text-to-video (T2V) generation task can provide rich and diverse video content, but it suffers from typical issues such as content inconsistency between frames and text-alignment failure, which degrade video smoothness. Moreover, attempts to fix these smoothness problems often sacrifice background texture and artistic expression through over-smoothing. To address these problems, this paper proposes INR Smooth, a video smoothing strategy based on the relationship between interframe noises, which can improve the smoothness of most T2V generation tasks. Building on INR Smooth, two video smoothing editing methods are proposed. The first targets T2V training models: based on the studied interframe noise relationship, noise constraints are applied simultaneously from the beginning and the end of the video, and a video smoothing loss function is constructed. The second targets training-free T2V models: DDIM Inversion is additionally introduced to preserve text alignment while improving smoothness. Experimental comparisons show that the proposed methods significantly improve text alignment and temporal consistency, and perform especially well on smooth transitions in real scenes and on the portrayal of artistic styles. The proposed training-free method and zero-shot fine-tuning method for video smoothing require no additional computing resources. The source code and video demos are available at https://github.com/Cuihong-Yu/INR-Smooth.
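The abstract does not give the paper's formulas, but the core idea of relating the noises of intermediate frames to those of the first and last frame can be illustrated with one common technique: spherical linear interpolation (slerp) between the two endpoint noise tensors, which keeps each interpolated noise approximately Gaussian-distributed. This is a minimal sketch under that assumption, not the paper's exact constraint; the function names and latent shape are hypothetical.

```python
import numpy as np

def slerp(a, b, t):
    """Spherical linear interpolation between two noise tensors.

    Unlike linear interpolation, slerp roughly preserves the norm of
    Gaussian noise, so intermediate noises stay valid diffusion inputs.
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    cos_omega = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Vectors are nearly parallel; fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def interframe_noises(start_noise, end_noise, num_frames):
    """Build per-frame initial noises that transition smoothly from the
    first frame's noise to the last frame's noise (hypothetical helper)."""
    ts = np.linspace(0.0, 1.0, num_frames)
    return np.stack([slerp(start_noise, end_noise, t) for t in ts])

rng = np.random.default_rng(0)
eps_start = rng.standard_normal((4, 8, 8))  # toy latent shape
eps_end = rng.standard_normal((4, 8, 8))
noises = interframe_noises(eps_start, eps_end, num_frames=8)
print(noises.shape)  # (8, 4, 8, 8)
```

Because each interpolated noise is tied to both endpoints, neighboring frames receive highly correlated starting noise, which is one route to smoother frame-to-frame content in diffusion sampling.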
