TADA
[3DV 2024] Official Repository for "TADA! Text to Animatable Digital Avatars".
TADA takes text as input and produces holistic, animatable 3D avatars with high-quality geometry and texture. It enables the creation of large-scale digital character assets that are ready for animation and rendering and that can be easily edited through natural language.
NEWS (2023.9.24):
- Use the Omnidata normal prediction model to improve the consistency between normals and images.
https://github.com/TingtingLiao/TADA/assets/45743512/f626af25-3c5c-4ab5-bbe6-7a85d95af913
https://github.com/TingtingLiao/TADA/assets/45743512/442d6617-3549-48cc-9868-e4fe0c4ba842
Install
- System requirement: Ubuntu 20.04
- Tested GPUs: RTX4090, A100, V100
- Compiler: gcc-7.5 / g++-7.5
- Python=3.9, CUDA=11.5, PyTorch=1.12.1
git clone git@github.com:TingtingLiao/TADA.git
cd TADA
conda env create --file environment.yml
conda activate tada
pip install -r requirements.txt
cd smplx
python setup.py install
cd ..  # return to the repository root
# download omnidata normal and depth prediction model
mkdir -p data/omnidata
cd data/omnidata
gdown '1Jrh-bRnJEjyMCS7f-WsaFlccfPjJPPHI&confirm=t' # omnidata_dpt_depth_v2.ckpt
gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t' # omnidata_dpt_normal_v2.ckpt
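Once the environment is set up, a quick check like the one below can help confirm that the interpreter and PyTorch build match the tested versions listed above. This is only a sketch and not part of the TADA repository; the function name `report_environment` is hypothetical, and the script simply reports `None` for PyTorch and CUDA if `torch` is not installed.

```python
import importlib.util
import platform


def report_environment():
    """Collect the versions pinned in the requirements list (hypothetical helper)."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "gpu": False,
    }
    # Import torch only if it is installed, so the check also runs in a bare environment.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
        info["gpu"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for name, value in report_environment().items():
        print(f"{name}: {value}")
```

Compare the printed values against Python 3.9, CUDA 11.5, and PyTorch 1.12.1; other versions may work but are not listed as tested.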
Data
- SMPL-X Model (Registration required. Download SMPLX_NEUTRAL_2020.npz and put it into ./data/smplx/)
- TADA Extra Data (Required) Unzip it as the directory ./data
- TADA 100 Characters (Optional)
- Optional Motion Data
@inproceedings{aist-dance-db,
author = {Shuhei Tsuchida and Satoru Fukayama and Masahiro Hamasaki and Masataka Goto},
title = {AIST Dance Video Database: Multi-genre, Multi-dancer, and Multi-camera Database for Dance Information Processing},
booktitle = {Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR) },
year = {2019},
month = {Nov}
}
@inproceedings{li2021learn,
title={AI Choreographer: Music Conditioned 3D Dance Generation with AIST++},
author={Ruilong Li and Shan Yang and David A. Ross and Angjoo Kanazawa},
year={2021},
booktitle={ICCV}
}
@inproceedings{yi2023generating,
title={Generating Holistic 3D Human Motion from Speech},
author={Yi, Hongwei and Liang, Hualin and Liu, Yifei and Cao, Qiong and Wen, Yandong and Bolkart, Timo and Tao, Dacheng and Black, Michael J.},
booktitle={CVPR},
pages={469-480},
month={June},
year={2023}
}
@inproceedings{tevet2023human,
title={Human Motion Diffusion Model},
author={Guy Tevet and Sigal Raab and Brian Gordon and Yoni Shafir and Daniel Cohen-or and Amit Haim Bermano},
booktitle={ICLR},
year={2023},
url={https://openreview.net/forum?id=SJ1kSyO2jwu}
}
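Before training, it can be useful to confirm that the downloaded files are where the steps above place them. The sketch below is not part of the TADA repository: the helper name `missing_files` is hypothetical, and it only checks the paths named in this README (the SMPL-X model and, optionally, the two Omnidata checkpoints). The contents of the TADA Extra Data archive are not listed here, so they are not checked.

```python
from pathlib import Path

# Paths taken from the install and data steps, relative to the repository root.
REQUIRED = ["data/smplx/SMPLX_NEUTRAL_2020.npz"]
OMNIDATA = [
    "data/omnidata/omnidata_dpt_depth_v2.ckpt",
    "data/omnidata/omnidata_dpt_normal_v2.ckpt",
]


def missing_files(root=".", with_omnidata=False):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    expected = REQUIRED + (OMNIDATA if with_omnidata else [])
    return [p for p in expected if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    # Run from the repository root; an empty result prints nothing.
    for path in missing_files(with_omnidata=True):
        print("missing:", path)
```

Pass `with_omnidata=True` only if you plan to train with Omnidata supervision.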
Usage
Training
The results will be saved in $workspace. To change the output location, edit it in the configs/*.yaml files.
# single prompt training
python -m apps.run --config configs/tada.yaml --text "Aladdin in Aladdin"
# with Omnidata supervision
python -m apps.run --config configs/tada_w_dpt.yaml --text "Aladdin in Aladdin"
# multiple prompts training
bash scripts/run.sh data/prompt/fictional.txt 1 10 configs/tada.yaml
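If you want to drive several prompts from Python rather than through scripts/run.sh, one option is to build the single-prompt command shown above once per prompt. The sketch below is an assumption-laden illustration, not a description of what scripts/run.sh does (the meaning of its numeric arguments is not documented here). The helper `build_commands` is hypothetical; only the `apps.run` module and its `--config` and `--text` flags come from the commands above.

```python
import shlex


def build_commands(prompts, config="configs/tada.yaml"):
    """Build one single-prompt training command per non-empty prompt (hypothetical helper)."""
    commands = []
    for prompt in prompts:
        prompt = prompt.strip()
        if not prompt:
            continue  # skip blank lines
        commands.append(
            ["python", "-m", "apps.run", "--config", config, "--text", prompt]
        )
    return commands


if __name__ == "__main__":
    # Print shell-quoted commands; pass each argv list to subprocess.run to execute it.
    for argv in build_commands(["Aladdin in Aladdin", "Abraham Lincoln"]):
        print(shlex.join(argv))
```

Keeping each command as an argument list avoids quoting problems with prompts that contain spaces or quotes.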
Animation
- Download AIST, or generate motions with TalkShow or MotionDiffusion.
python -m apps.anime --subject "Abraham Lincoln" --res_dir your_result_path
Tips
- Choosing an appropriate learning rate for the SMPL-X shape parameters is important for learning an accurate shape.
- Omnidata normal supervision can effectively enhance the overall geometry and texture consistency; however, it demands more time for optimization.
Citation
@inproceedings{liao2024tada,
title={{TADA! Text to Animatable Digital Avatars}},
author={Liao, Tingting and Yi, Hongwei and Xiu, Yuliang and Tang, Jiaxiang and Huang, Yangyi and Thies, Justus and Black, Michael J.},
booktitle={International Conference on 3D Vision (3DV)},
year={2024}
}
Related Works
- HumanNorm: multi-stage SDS loss and perceptual loss help generate lifelike texture.
- SemanticBoost: uses TADA's rigged avatars to demonstrate the generated motions.
- SignAvatars: uses TADA's rigged avatars to demonstrate the sign language data.
- GALA: uses TADA's avatars for asset generation.
Contributors
<a href="https://github.com/TingtingLiao/TADA/graphs/contributors"> <img src="https://contrib.rocks/image?repo=TingtingLiao/TADA" /> </a>
License
This code and model are available for non-commercial scientific research purposes as defined in the LICENSE (i.e., the MIT License). Note that to use TADA you must register for SMPL-X and agree to its license, which is not the MIT License; you can check it at https://github.com/vchoutas/smplx/blob/main/LICENSE. Enjoy exploring more beautiful avatars in your own applications.
