OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering (CVPR 2023)
Paper | Demo
Update
April 30: The model weights are released. The dataset is also available on Google Drive; see below for details.
April 4: The preprocessed dataset is released; please see the Data preparation section. Some missing files have also been uploaded.
Get started
Environment Setup
git clone git@github.com:theEricMa/OTAvatar.git
cd OTAvatar
conda env create -f environment.yml
conda activate otavatar
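As a quick sanity check that the environment resolved correctly (assuming PyTorch is listed in environment.yml, which the torch.distributed commands below rely on):

```shell
# Verify that PyTorch imports and report whether a GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```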
Pre-trained Models
Download the EG3D FFHQ model ffhqrebalanced512-64.pth [Baidu Netdisk][Google Drive] and copy it to the pretrained directory. It is the ffhqrebalanced512-64.pkl file obtained from the EG3D webpage and converted to .pth format with the pkl2pth script.
Download arcface_resnet18.pth and save it to the pretrained directory.
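The pkl-to-pth conversion can be sketched as below. This is an illustrative guess at what the pkl2pth script does, not the script itself; in particular, the checkpoint key G_ema is an assumption based on EG3D's released pickles.

```python
import pickle

import torch


def pkl_to_pth(pkl_path: str, pth_path: str, key: str = "G_ema") -> None:
    """Extract a network from an EG3D-style .pkl and re-save its weights
    as a plain .pth state_dict. The key "G_ema" is an assumption."""
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)  # the pickle bundles live network objects
    torch.save(data[key].state_dict(), pth_path)
```

Note that unpickling the official checkpoint requires EG3D's own code on the Python path, since the pickle references its classes.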
Data preparation
We have uploaded the processed dataset hdtf_lmdb_inv to [Baidu Netdisk][Google Drive]. In the root directory,
mkdir datasets
mv <your hdtf_lmdb_inv path> datasets/
The processing scripts are generally a mixture of those in PIRenderer and ADNeRF. We plan to open a new repository for our revised preprocessing scripts.
Face Animation
Create the folder result/otavatar if it does not exist. Place the model downloaded from [Baidu Netdisk][Google Drive] under this directory. Run,
export CUDA_VISIBLE_DEVICES=0
python -m torch.distributed.launch --nproc_per_node=1 --master_port 12345 inference_refine_1D_cam.py \
--config ./config/otavatar.yaml \
--name otavatar \
--no_resume \
--which_iter 2000 \
--image_size 512 \
--ws_plus \
--cross_id \
--cross_id_target WRA_EricCantor_000 \
--output_dir ./result/otavatar/evaluation/cross_ws_plus_WRA_EricCantor_000
This animates each identity given the motion from WRA_EricCantor_000.
Or simply run,
sh scripts/inference.sh
Start Training
Run,
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m torch.distributed.launch --nproc_per_node=4 --master_port 12346 train_inversion.py \
--config ./config/otavatar.yaml \
--name otavatar
Or simply run,
sh scripts/train.sh
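For context, torch.distributed.launch spawns --nproc_per_node worker processes and passes each one a --local_rank argument (recent PyTorch versions set the LOCAL_RANK environment variable instead), which tells a worker which GPU to bind to. A minimal illustrative worker stub (not from this repo):

```python
import argparse
import os

# torch.distributed.launch passes --local_rank to every worker it spawns;
# newer PyTorch versions export LOCAL_RANK in the environment instead.
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int,
                    default=int(os.environ.get("LOCAL_RANK", 0)))
args, _ = parser.parse_known_args()
print(f"local rank: {args.local_rank}")  # 0..3 with --nproc_per_node=4
```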
Acknowledgement
We appreciate the models and code from EG3D, PIRenderer, StyleHEAT, and EG3D-projector.
Citation
If you find this work helpful, please cite:
@article{ma2023otavatar,
title={OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering},
author={Ma, Zhiyuan and Zhu, Xiangyu and Qi, Guojun and Lei, Zhen and Zhang, Lei},
journal={arXiv preprint arXiv:2303.14662},
year={2023}
}