Recurrent Flow Update Model Using Image Pyramid Structure for 4K Video Frame Interpolation

Abstract

Video frame interpolation (VFI) generates intermediate frames from two consecutive frames. Previous studies have taken two main approaches to extracting the necessary information from both frames: pixel-level synthesis and flow-based methods. Each has limitations when applied to high-resolution video. Pixel-level synthesis based on the transformer architecture incurs high computational complexity at 4K resolution. Among flow-based methods, forward warping can leave holes where no source pixel is mapped, while backward warping approaches struggle to obtain accurate backward flow. Training poses a further challenge: previous works have often trained the stages of multi-stage architectures separately, which yields suboptimal results. To address these issues, we propose a Recurrent Flow Update (RFU) model trained end to end. We introduce a global flow update module that leverages global information to mitigate the weaknesses of forward flow and gradually correct errors. We demonstrate the effectiveness of our method through several ablation studies. Our approach achieves state-of-the-art performance not only on the 4K-resolution XTest and Davis datasets, but also on the SNU-FILM dataset, which features large motions at low resolution.
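
To make the contrast between forward and backward warping concrete, the following toy NumPy sketch may help. It is not the RFU model and not the authors' code. The function names `forward_warp_nearest` and `backward_warp_nearest`, the nearest-neighbor rounding, and the 4x8 test frame are all assumptions chosen for illustration.

```python
import numpy as np

def forward_warp_nearest(image, flow):
    """Splat each source pixel to its rounded target location (hypothetical helper).

    image: (H, W) array; flow: (H, W, 2) array of (dx, dy) displacements.
    Returns the warped image and a boolean mask of target pixels that received a value.
    """
    h, w = image.shape
    warped = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    ys, xs = np.mgrid[0:h, 0:w]
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)
    # Discard pixels that land outside the frame.
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    # If several sources hit one target, one of them simply overwrites the others.
    warped[ty[inside], tx[inside]] = image[inside]
    filled[ty[inside], tx[inside]] = True
    return warped, filled

def backward_warp_nearest(image, flow):
    """For every target pixel, sample the source at (x + dx, y + dy), clamped to the border."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return image[sy, sx]

# Toy frame: the right half of a 4x8 image moves 2 pixels to the right.
image = np.arange(32, dtype=float).reshape(4, 8)
forward_flow = np.zeros((4, 8, 2))
forward_flow[:, 4:, 0] = 2.0

warped, filled = forward_warp_nearest(image, forward_flow)
holes = ~filled
print(int(holes.sum()))  # 8: columns 4 and 5 receive no source pixel

# Backward warping needs a flow defined on the target grid (assumed known here).
backward_flow = np.zeros((4, 8, 2))
backward_flow[:, 6:, 0] = -2.0
resampled = backward_warp_nearest(image, backward_flow)
```

In this sketch the forward warp leaves columns 4 and 5 empty because nothing maps there. The backward warp assigns a value to every target pixel, but whether those values are right depends entirely on the accuracy of the backward flow, which is simply assumed here.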
