SAFA
Official PyTorch implementation of the 3DV 2021 paper: SAFA: Structure Aware Face Animation.


Getting Started
git clone https://github.com/Qiulin-W/SAFA.git
Installation
Python 3.6 or higher is recommended.
1. Install PyTorch3D
Follow the guidance from: https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md.
2. Install Other Dependencies
To install the other dependencies, run:
pip install -r requirements.txt
Usage
1. Preparation
a. Download the FLAME model (choose FLAME 2020), unzip it, and put generic_model.pkl under ./modules/data.
b. Download head_template.obj, landmark_embedding.npy, uv_face_eye_mask.png and uv_face_mask.png from DECA/data, and put them under ./modules/data.
c. Download the SAFA model checkpoint from Google Drive and put it under ./ckpt.
d. (Optional, required by the face swap demo) Download the pretrained face parser from face-parsing.PyTorch and put it under ./face_parsing/cp.
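Before running the demos, it can help to confirm that the downloads ended up where the steps above expect them. The following is a minimal sketch, not part of the repository: it assumes all FLAME/DECA files live in ./modules/data, and because the checkpoint file names are not given here it only checks that ./ckpt (and optionally ./face_parsing/cp) is non-empty. The function name find_missing is hypothetical.

```python
from pathlib import Path

# Directory and file names follow the preparation steps above.
# Assumption: every FLAME/DECA file is expected in modules/data.
DATA_DIR = "modules/data"
DATA_FILES = [
    "generic_model.pkl",
    "head_template.obj",
    "landmark_embedding.npy",
    "uv_face_eye_mask.png",
    "uv_face_mask.png",
]
CHECKPOINT_DIR = "ckpt"
FACE_PARSER_DIR = "face_parsing/cp"  # only needed for the face swap demo


def find_missing(repo_root=".", with_face_swap=False):
    """Return a list of problems; an empty list means everything was found."""
    root = Path(repo_root)
    problems = []
    for name in DATA_FILES:
        if not (root / DATA_DIR / name).is_file():
            problems.append(f"missing file: {DATA_DIR}/{name}")
    folders = [CHECKPOINT_DIR]
    if with_face_swap:
        folders.append(FACE_PARSER_DIR)
    for folder in folders:
        path = root / folder
        # Checkpoint file names are not specified, so only test for "non-empty".
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(f"missing or empty directory: {folder}")
    return problems
```

Calling find_missing(".", with_face_swap=True) from the repository root and printing the returned list gives a quick overview of what is still missing.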
2. Demos
We provide demos for animation and face swap.
a. Animation demo
python animation_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --relative --adapt_scale --find_best_frame
b. Face swap demo
We adopt face-parsing.PyTorch to identify the face regions in both the source and driving images.
For preprocessed source images and driving videos, run:
python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video
For arbitrary images and videos, we use a face detector to detect and swap the corresponding face parts. Cropped images are resized to 256×256 to fit our model.
python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --use_detection
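To illustrate the "detect, crop, resize to 256×256" step mentioned for --use_detection, here is a rough sketch of how a detector bounding box could be turned into a square crop and a resize factor. This is an assumption for illustration only: the repository's actual cropping logic is not described here, and square_crop_box, resize_scale and the margin parameter are hypothetical names.

```python
MODEL_SIZE = 256  # input resolution stated above


def square_crop_box(bbox, image_width, image_height, margin=0.0):
    """Expand a face box (x1, y1, x2, y2) to a square that stays inside the image.

    Returns (left, top, right, bottom) in integer pixel coordinates.
    """
    x1, y1, x2, y2 = bbox
    # Use the longer side so the whole face fits, optionally with extra margin.
    side = max(x2 - x1, y2 - y1) * (1.0 + margin)
    # The square cannot be larger than the image itself.
    side = int(round(min(side, image_width, image_height)))
    if side <= 0:
        raise ValueError("bounding box must have positive size")
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    left = int(round(cx - side / 2.0))
    top = int(round(cy - side / 2.0))
    # Shift the square back inside the image if it crosses a border.
    left = max(0, min(left, image_width - side))
    top = max(0, min(top, image_height - side))
    return left, top, left + side, top + side


def resize_scale(box):
    """Scale factor that maps the square crop to MODEL_SIZE x MODEL_SIZE."""
    left, _, right, _ = box
    return MODEL_SIZE / (right - left)
```

Any image library can then crop with the returned box and resize the result to 256×256. The scale factor is what is needed to map generated pixels back onto the original frame.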
Training
We modify the distributed training framework of the First Order Motion Model. Instead of torch.nn.DataParallel (DP), we adopt torch.nn.parallel.DistributedDataParallel (DDP) for faster training and a more balanced GPU memory load. Training is divided into two steps: (1) pre-training the 3DMM estimator and (2) end-to-end training.
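Unlike DP, which drives all GPUs from one process, DDP runs one process per GPU, and each process has to learn which GPU it owns. Below is a minimal sketch of how a script started by torch.distributed.launch can pick that up. It is not taken from run_ddp.py, and parse_rank_info is a hypothetical helper. The launcher passes the local rank as --local_rank (older PyTorch) or --local-rank (PyTorch 2.x), and also sets RANK and WORLD_SIZE in the environment.

```python
import argparse
import os


def parse_rank_info(argv=None, environ=None):
    """Collect the per-process settings provided by the launcher."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # Accept both spellings; fall back to the LOCAL_RANK environment variable.
    parser.add_argument(
        "--local_rank", "--local-rank",
        dest="local_rank", type=int,
        default=int(environ.get("LOCAL_RANK", 0)),
    )
    # Other options such as --config are left for the script's own parser.
    args, _unknown = parser.parse_known_args(argv)
    return {
        "local_rank": args.local_rank,
        "rank": int(environ.get("RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
    }


# Typical follow-up inside each process (requires PyTorch, shown as comments only):
#   torch.cuda.set_device(info["local_rank"])
#   torch.distributed.init_process_group(backend="nccl")
#   model = torch.nn.parallel.DistributedDataParallel(
#       model.cuda(), device_ids=[info["local_rank"]])
```

Because every process holds its own model replica and only gradients are synchronized, memory use stays even across GPUs, which is the balance advantage mentioned above.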
3DMM Estimator Pre-training
CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/pretrain.yaml
End-to-end Training
CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/end2end.yaml --tdmm_checkpoint path/to/tdmm_checkpoint_pth
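In both training commands, the number of entries in CUDA_VISIBLE_DEVICES has to match --nproc_per_node. The sketch below builds the two commands from a single GPU list so the values cannot drift apart. build_launch is a hypothetical helper, not part of the repository; it only uses the script name and flags shown above.

```python
import os
import sys


def build_launch(gpu_ids, config, tdmm_checkpoint=None):
    """Return (command, environment) for one training stage."""
    # One process per visible GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node", str(len(gpu_ids)),
        "run_ddp.py", "--config", config,
    ]
    # The end-to-end stage starts from the pre-trained 3DMM estimator.
    if tdmm_checkpoint is not None:
        cmd += ["--tdmm_checkpoint", tdmm_checkpoint]
    return cmd, env


# Stage 1: 3DMM estimator pre-training on two GPUs.
pretrain_cmd, pretrain_env = build_launch([0, 1], "config/pretrain.yaml")
# Stage 2: end-to-end training, reusing the stage 1 checkpoint (placeholder path).
end2end_cmd, end2end_env = build_launch(
    [0, 1], "config/end2end.yaml", "path/to/tdmm_checkpoint_pth"
)
```

Each pair can be passed to subprocess.run(cmd, env=env, check=True) from the repository root.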
Evaluation / Inference
Video Reconstruction
python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode reconstruction
Image Animation
python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode animation
3D Face Reconstruction
python tdmm_inference.py --data_dir directory/to/images --tdmm_checkpoint path/to/tdmm_checkpoint_pth
Dataset and Preprocessing
We use VoxCeleb1 to train and evaluate our model. The original YouTube videos are downloaded, cropped and split following the instructions from video-preprocessing.
a. To obtain the facial landmark metadata from the preprocessed videos, run:
python video_ldmk_meta.py --video_dir directory/to/preprocessed_videos --out_dir directory/to/output_meta_files
b. (Optional) Extract images from videos for 3DMM pretraining:
python extract_imgs.py
Citation
If you find our work useful for your research, please consider citing:
@article{wang2021safa,
title={SAFA: Structure Aware Face Animation},
author={Wang, Qiulin and Zhang, Lu and Li, Bo},
journal={arXiv preprint arXiv:2111.04928},
year={2021}
}
License
Please refer to the LICENSE file.
Acknowledgement
Here we list the external sources that we use or adapt:
- Code is heavily borrowed from First Order Motion Model, LICENSE.
- Some code is also borrowed from:
  a. FLAME_PyTorch, LICENSE
  b. generative-inpainting-pytorch, LICENSE
  c. face-parsing.PyTorch, LICENSE
  d. video-preprocessing.
- We adopt FLAME model resources from:
  a. DECA, LICENSE
  b. FLAME, LICENSE
- External libraries:
  a. PyTorch3D, LICENSE
  b. face-alignment, LICENSE
