DAM

[CVPR 2022] Structure-Aware Motion Transfer with Deformable Anchor Model

Code for the CVPR 2022 paper "Structure-Aware Motion Transfer with Deformable Anchor Model".

Environments

The models were trained on 4 Tesla V100 GPUs. PyTorch 1.6 and 1.8 with Python 3.6 have been tested and work fine. Basic dependencies are listed in requirements.txt.

pip install -r requirements.txt

Datasets

We use TaiChiHD, VoxCeleb1, FashionVideo, and MGIF, all prepared following FOMM. After downloading and pre-processing, place each dataset in the ./data folder, or change the root_dir parameter in the YAML config file. Note that we save each video dataset as PNG frames (for example, ./data/taichi-png/train/video-id/frames-id.png) for better training I/O performance. All train and test video frames are listed in txt files in the ./data folder.
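As a quick check that a pre-processed dataset follows the layout described above, the following minimal sketch indexes one split directory. The helper names list_frames and index_split are hypothetical and not part of this repository, and sorting frames by file name assumes zero-padded frame ids, which the README does not confirm.

```python
from pathlib import Path


def list_frames(video_dir):
    """Return the PNG frames of one video directory, sorted by file name.

    Assumption: frame ids are zero-padded, so name order equals frame order.
    """
    return sorted(Path(video_dir).glob("*.png"))


def index_split(split_dir):
    """Map each video-id directory under a split to its sorted frame paths."""
    return {
        video.name: list_frames(video)
        for video in sorted(Path(split_dir).iterdir())
        if video.is_dir()  # skip stray files next to the video directories
    }
```

For example, index_split("./data/taichi-png/train") would return a dict keyed by video id, which makes empty or missing video directories easy to spot before training.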

Checkpoints

Google Drive, Baiduyun (password: z4ej)

Training

We train the HDAM model in two stages. First, we train DAM and detect its abnormal keypoints; the indexes of the detected keypoints are written to the HDAM config via the ignore_kp_list parameter. We then train the HDAM model, initialized from the DAM checkpoint.

Train DAM

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 run.py --config config/dataset-dam.yaml

Train HDAM

CUDA_VISIBLE_DEVICES=0 python equivariance_detection.py --config config/dataset-dam.yaml --config_hdam config/dataset-hdam.yaml --checkpoint path/to/dam/model.pth
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 run.py --config config/dataset-hdam.yaml --checkpoint path/to/dam/model.pth
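The three commands above run in a fixed order: DAM training, abnormal keypoint detection, then HDAM training. The sketch below only assembles those command lines as argument lists so they can be scripted; the functions launch_cmd and hdam_pipeline are hypothetical helpers, not part of the repository, and the CUDA_VISIBLE_DEVICES setting is left to the caller's environment.

```python
def launch_cmd(config, checkpoint=None, nproc=4):
    """Build the distributed run.py command used for DAM and HDAM training."""
    cmd = ["python", "-m", "torch.distributed.launch",
           f"--nproc_per_node={nproc}", "run.py", "--config", config]
    if checkpoint is not None:
        # HDAM training is initialized from the trained DAM checkpoint.
        cmd += ["--checkpoint", checkpoint]
    return cmd


def hdam_pipeline(dam_config, hdam_config, dam_checkpoint, nproc=4):
    """Return the three training steps in the order given in the README."""
    detect = ["python", "equivariance_detection.py",
              "--config", dam_config,
              "--config_hdam", hdam_config,
              "--checkpoint", dam_checkpoint]
    return [
        launch_cmd(dam_config, nproc=nproc),                  # stage 1: train DAM
        detect,                                               # write ignore_kp_list
        launch_cmd(hdam_config, dam_checkpoint, nproc=nproc), # stage 2: train HDAM
    ]


if __name__ == "__main__":
    steps = hdam_pipeline("config/dataset-dam.yaml",
                          "config/dataset-hdam.yaml",
                          "path/to/dam/model.pth")
    for step in steps:
        print(" ".join(step))
```

Each list can be passed to subprocess.run once the previous step has finished. Note that the detection step needs the checkpoint produced by the first step, so the steps cannot be run in parallel.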

Evaluation

Evaluate video reconstruction with the following command. For more metrics, we recommend FOMM-Pose-Evaluation.

CUDA_VISIBLE_DEVICES=0 python run.py --mode reconstruction --config path/to/config --checkpoint path/to/model.pth

Demo

To make a demo animation, specify the driving video and the source image. The result video is saved to the path given by --result_video (result.mp4 in the command below).

python demo.py --config path/to/config --checkpoint path/to/model.pth --driving_video path/to/video.mp4 --source_image path/to/image.png --result_video path/to/result.mp4 --adapt_scale

E-commerce animation demo

We have built some applications for the e-commerce scenario, which are shown in the demo paper Move As You Like.

Citation

@inproceedings{tao2022structure,
title={Structure-Aware Motion Transfer with Deformable Anchor Model},
author={Tao, Jiale and Wang, Biao and Xu, Borun and Ge, Tiezheng and Jiang, Yuning and Li, Wen and Duan, Lixin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={3637--3646},
year={2022}
}

@inproceedings{xu2021move,
title={Move As You Like: Image Animation in E-Commerce Scenario},
author={Xu, Borun and Wang, Biao and Tao, Jiale and Ge, Tiezheng and Jiang, Yuning and Li, Wen and Duan, Lixin},
booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
pages={2759--2761},
year={2021}
}

Acknowledgements

The implementation borrows heavily from FOMM; we thank the authors for their great efforts in this area.
