VFIformer
Video Frame Interpolation with Transformer (CVPR2022)
Official PyTorch implementation of our CVPR2022 paper Video Frame Interpolation with Transformer
News
Dependencies
- python >= 3.8
- pytorch >= 1.8.0
- torchvision >= 0.9.0
Prepare Dataset
To train on Vimeo90K, the ground-truth flows between frames must first be computed with LiteFlowNet. Clone the LiteFlowNet repo, put the `compute_flow_vimeo.py` we provide under its main directory, and run the following (remember to change the data path in the script; the LiteFlowNet checkpoint it loads can be found here):

```
python compute_flow_vimeo.py
```
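For orientation, each Vimeo90K triplet stores the two input frames as `im1.png`/`im3.png` with the middle frame `im2.png` as interpolation ground truth. A minimal sketch of enumerating the frame pairs the flow script would process (the layout and `tri_trainlist.txt` filename follow the standard Vimeo90K release; `list_flow_pairs` is an illustrative helper, not part of this repo):

```python
from pathlib import Path

def list_flow_pairs(vimeo_root):
    """Enumerate (im1, im3) frame pairs for each Vimeo90K triplet.

    Assumes the standard layout <root>/sequences/<seq>/<clip>/im{1,2,3}.png
    and a list file <root>/tri_trainlist.txt with lines like "00001/0001".
    """
    root = Path(vimeo_root)
    pairs = []
    with open(root / "tri_trainlist.txt") as f:
        for line in f:
            clip = line.strip()
            if not clip:
                continue
            d = root / "sequences" / clip
            # Flow is computed between the two input frames (im1 and im3);
            # im2 is the interpolation ground truth, not a flow input.
            pairs.append((d / "im1.png", d / "im3.png"))
    return pairs
```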
Get Started
- Clone this repo.

  ```
  git clone https://github.com/Jia-Research-Lab/VFIformer.git
  cd VFIformer
  ```

- Modify the argument `--data_root` in `train.py` according to your Vimeo90K path.
Evaluation
- Download the pre-trained models and place them into the `pretrained_models/` folder. Pre-trained models can be downloaded from Google Drive:
  - `pretrained_VFIformer`: the final model in the main paper
  - `pretrained_VFIformerSmall`: the smaller version of the model mentioned in the supplementary file

- Test on the Vimeo90K testing set. Modify the argument `--data_root` according to your data path, then run:

  ```
  python test.py --data_root [your Vimeo90K path] --testset VimeoDataset --net_name VFIformer --resume ./pretrained_models/pretrained_VFIformer/net_220.pth --save_result
  ```

  If you want to test with the smaller model, change `--net_name` and `--resume` accordingly:

  ```
  python test.py --data_root [your Vimeo90K path] --testset VimeoDataset --net_name VFIformerSmall --resume ./pretrained_models/pretrained_VFIformerSmall/net_220.pth --save_result
  ```

  The testing results are saved in the `test_results/` folder. If you do not want to save the image results, remove the `--save_result` argument from the commands.

- Test on the MiddleBury dataset. Modify the argument `--data_root` according to your data path, then run:

  ```
  python test.py --data_root [your MiddleBury path] --testset MiddleburyDataset --net_name VFIformer --resume ./pretrained_models/pretrained_VFIformer/net_220.pth --save_result
  ```

- Test on the UCF101 dataset. Modify the argument `--data_root` according to your data path, then run:

  ```
  python test.py --data_root [your UCF101 path] --testset UFC101Dataset --net_name VFIformer --resume ./pretrained_models/pretrained_VFIformer/net_220.pth --save_result
  ```

- Test on the SNU-FILM dataset. Modify the argument `--data_root` according to your data path, choose the motion level and modify the argument `--test_level` accordingly, then run:

  ```
  python FILM_test.py --data_root [your SNU-FILM path] --test_level [easy/medium/hard/extreme] --net_name VFIformer --resume ./pretrained_models/pretrained_VFIformer/net_220.pth
  ```
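Results on these benchmarks are conventionally reported as PSNR/SSIM between the interpolated frame and the ground-truth middle frame. The repo's `test.py` contains the actual evaluation code; as a reminder of what the headline number means, here is an illustrative PSNR computation over flat pixel lists:

```python
import math

def psnr(pred, gt, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-size frames.

    `pred` and `gt` are flat lists of pixel intensities in [0, max_val].
    """
    assert pred and len(pred) == len(gt), "frames must be non-empty and equal-size"
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher is better; an all-wrong prediction (every pixel off by the full range) scores 0 dB, and typical Vimeo90K interpolation results land in the mid-30s.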
Training
- First train the flow estimator. (Note that skipping this step does not have a significant impact on performance; we keep it only to be consistent with our paper.)

  ```
  python -m torch.distributed.launch --nproc_per_node=4 --master_port=4174 train.py --launcher pytorch --gpu_ids 0,1,2,3 \
      --loss_flow --use_tb_logger --batch_size 48 --net_name IFNet --name train_IFNet --max_iter 300 --crop_size 192 --save_epoch_freq 5
  ```

- Then train the whole framework.

  ```
  python -m torch.distributed.launch --nproc_per_node=8 --master_port=4175 train.py --launcher pytorch --gpu_ids 0,1,2,3,4,5,6,7 \
      --loss_l1 --loss_ter --loss_flow --use_tb_logger --batch_size 24 --net_name VFIformer --name train_VFIformer --max_iter 300 \
      --crop_size 192 --save_epoch_freq 5 --resume_flownet ./weights/train_IFNet/snapshot/net_final.pth
  ```

- To train the smaller version, run:

  ```
  python -m torch.distributed.launch --nproc_per_node=8 --master_port=4175 train.py --launcher pytorch --gpu_ids 0,1,2,3,4,5,6,7 \
      --loss_l1 --loss_ter --loss_flow --use_tb_logger --batch_size 24 --net_name VFIformerSmall --name train_VFIformerSmall --max_iter 300 \
      --crop_size 192 --save_epoch_freq 5 --resume_flownet ./weights/train_IFNet/snapshot/net_final.pth
  ```
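The `--loss_l1`, `--loss_ter`, and `--loss_flow` flags each switch on one term of the training objective (L1 reconstruction, a ternary/census term, and flow supervision). A toy sketch of how such flag-gated terms combine into a total loss; the extra terms and weights here are placeholders for illustration, not the repo's actual definitions:

```python
def l1_loss(pred, gt):
    # mean absolute error over flat pixel lists
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)

def combine_losses(pred, gt, flags, extra_terms=None, weights=None):
    """Sum the enabled loss terms, mirroring flag-gated losses such as
    --loss_l1 / --loss_ter / --loss_flow.

    `flags` maps term names to booleans; `extra_terms` maps names to
    callables standing in for terms (census, flow) that take more inputs
    than shown here.
    """
    weights = weights or {}
    terms = {"loss_l1": lambda: l1_loss(pred, gt)}
    terms.update({k: (lambda fn=v: fn(pred, gt)) for k, v in (extra_terms or {}).items()})
    total = 0.0
    for name, fn in terms.items():
        if flags.get(name, False):
            total += weights.get(name, 1.0) * fn()
    return total
```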
Test on your own data
- Modify the arguments `--img0_path` and `--img1_path` according to your data path, then run:

  ```
  python demo.py --img0_path [your img0 path] --img1_path [your img1 path] --save_folder [your save path] --net_name VFIformer --resume ./pretrained_models/pretrained_VFIformer/net_220.pth
  ```
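Since `demo.py` interpolates a single middle frame per pair, doubling the frame rate of a whole sequence means running it once per consecutive pair. A small sketch that builds those commands from a sorted frame folder (the flags mirror the command above; the flat `*.png` folder layout is an assumption):

```python
from pathlib import Path

def demo_commands(frame_dir, save_dir,
                  resume="./pretrained_models/pretrained_VFIformer/net_220.pth"):
    """Build one demo.py command per consecutive frame pair in frame_dir."""
    frames = sorted(Path(frame_dir).glob("*.png"))
    cmds = []
    for img0, img1 in zip(frames, frames[1:]):
        cmds.append(
            f"python demo.py --img0_path {img0} --img1_path {img1} "
            f"--save_folder {save_dir} --net_name VFIformer --resume {resume}"
        )
    return cmds
```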
Acknowledgement
We borrow some code from RIFE and SwinIR. We thank the authors for their great work.
Citation
Please consider citing our paper in your publications if it is useful for your research.
```
@inproceedings{lu2022vfiformer,
  title={Video Frame Interpolation with Transformer},
  author={Lu, Liying and Wu, Ruizheng and Lin, Huaijia and Lu, Jiangbo and Jia, Jiaya},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022},
}
```
Contact
lylu@cse.cuhk.edu.hk