FeatureFlow
A state-of-the-art Video Frame Interpolation Method using deep semantic flows blending.
FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation (IEEE Conference on Computer Vision and Pattern Recognition 2020)
To Do List
- [x] Preprint
- [x] Training code
Table of Contents
- Requirements
- Demos
- Installation
- Pre-trained Model
- Download Results
- Evaluation
- Test your video
- Training
- Citation
Requirements
- Ubuntu
- PyTorch (>=1.1)
- Cuda (>=10.0) & Cudnn (>=7.0)
- mmdet 1.0rc (from https://github.com/open-mmlab/mmdetection.git)
- visdom (not necessary)
- NVIDIA GPU
P.S.: A requirements.txt is provided for reference only; do not install from it directly, because it also contains another project's dependencies.
Video demos
Click a picture to download the corresponding demo, or click Here (Google) or Here (Baidu) (key: oav2) to download the 360p demos.
360p demos (including comparisons):
<img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/youtube.png"/> <img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/check_all.png"/> <img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/tianqi_all.png"/> <img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/video.png"/> <img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/shana.png"/>
720p demos:
<img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/SYA_1.png"/> <img width="320" height="180" src="https://github.com/CM-BF/FeatureFlow/blob/master/data/figures/SYA_2.png"/>
Installation
- Clone this repo
- Clone mmdetection: git clone https://github.com/open-mmlab/mmdetection.git
- Install mmdetection: please follow the guidance in its GitHub repository
$ cd mmdetection
$ pip install -r requirements/build.txt
$ pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
$ pip install -v -e . # or "python setup.py develop"
$ pip list | grep mmdet
- Download the test set (vimeo_interp_test.zip)
$ unzip vimeo_interp_test.zip
$ cd vimeo_interp_test
$ mkdir sequences
$ cp target/* sequences/ -r
$ cp input/* sequences/ -r
- Download BDCN's pre-trained model bdcn_pretrained_on_bsds500.pth and place it in ./model/bdcn/final-model/
P.S.: For convenience, you can download just bdcn_pretrained_on_bsds500.pth (Google Drive), or all of the pre-trained BDCN models provided by its authors (Google Drive). For a Baidu Cloud link, see BDCN's GitHub repository.
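The expected file layout can be prepared with a short shell sketch (assuming the .pth file has already been downloaded to the current directory):

```shell
# Create the directory BDCN's weights are expected in and move the
# downloaded checkpoint there (the move is skipped if the file is absent).
mkdir -p ./model/bdcn/final-model
if [ -f bdcn_pretrained_on_bsds500.pth ]; then
  mv bdcn_pretrained_on_bsds500.pth ./model/bdcn/final-model/
fi
```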
$ pip install scikit-image visdom tqdm prefetch-generator
Pre-trained Model
Baidu Cloud: ae4x
Place FeFlow.ckpt to ./checkpoints/.
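For example (assuming FeFlow.ckpt was downloaded to the current directory):

```shell
# Create the checkpoints directory and move the pre-trained model there
# (the move is skipped if the file is absent).
mkdir -p ./checkpoints
if [ -f FeFlow.ckpt ]; then
  mv FeFlow.ckpt ./checkpoints/
fi
```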
Download Results
Baidu Cloud: pc0k
Evaluation
$ CUDA_VISIBLE_DEVICES=0 python eval_Vimeo90K.py --checkpoint ./checkpoints/FeFlow.ckpt --dataset_root ~/datasets/videos/vimeo_interp_test --visdom_env test --vimeo90k --imgpath ./results/
Test your video
$ CUDA_VISIBLE_DEVICES=0 python sequence_run.py --checkpoint checkpoints/FeFlow.ckpt --video_path ./yourvideo.mp4 --t_interp 4 --slow_motion
--t_interp sets the frame-rate multiplier; only powers of 2 (2, 4, 8, ...) are supported. Use the --slow_motion flag to keep the original FPS, which slows the video down instead of increasing its frame rate.
The output video will be saved as output.mp4 in your working directory.
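Because sequence_run.py only accepts powers of 2 for --t_interp, a small helper (hypothetical, not part of the repo) can validate the value before launching:

```shell
# is_pow2 succeeds only for integers >= 2 that are powers of 2 (2, 4, 8, ...),
# using the classic n & (n - 1) bit trick: it is zero exactly for powers of 2.
is_pow2() {
  [ "$1" -ge 2 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

T_INTERP=4
if is_pow2 "$T_INTERP"; then
  echo "t_interp=$T_INTERP is a valid power of 2"
fi
```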
Training
The training code train.py is now available. I cannot run it for confirmation because I have left the lab, but it should work with the right argument settings.
$ CUDA_VISIBLE_DEVICES=0,1 python train.py <arguments>
- Please read the arguments' help carefully to fully control the two-step training.
- Pay attention to --GEN_DE, the flag that switches the model between Stage-I and Stage-II.
- 2 GPUs are necessary for training; otherwise the small batch size will crash the training process.
- Deformable convolutions are not stable enough, so the training may occasionally crash (the random seed is not fixed). A crash can be detected soon after the run starts by visualizing results with Visdom.
- Visdom visualization code [lines 75, 201-216, and 338-353] is included, which is useful for monitoring the training process and detecting crashes.
Citation
@InProceedings{Gui_2020_CVPR,
author = {Gui, Shurui and Wang, Chaoyue and Chen, Qihua and Tao, Dacheng},
title = {FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
Contact
License
See MIT License
