<div align="center"> <h1>HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting</h1>

Zhenglin Zhou · Fan Ma · Hehe Fan · Zongxin Yang · Yi Yang<sup>*</sup>

ReLER, CCAI, Zhejiang University

<sup>*</sup>Corresponding author

https://github.com/ZhenglinZhou/HeadStudio/assets/42434623/19893d52-8fe5-473d-b5c0-aea29d6be21a

</div>

Text to Head Avatars Generation

Text-based animatable avatar generation with HeadStudio.

Installation

All of the following steps have been tested successfully with CUDA 11.8.

# clone the github repo
git clone https://github.com/zhenglinzhou/HeadStudio-open.git
cd HeadStudio-open

Create a conda environment:

# make a new conda env (optional)
conda create -n headstudio python=3.9
conda activate headstudio

It may take some time to install:

# install necessary packages
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# install some packages using conda
bash packages.sh

# install packages using pip
pip install -r requirements.txt

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
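
Once the packages above are installed, it can help to confirm that the environment really picked up the CUDA 11.8 build of PyTorch before moving on. The following is a minimal sketch, not part of the repository: the helper names `cuda_tag` and `check_torch` are hypothetical, and the expected tag `cu118` is taken from the pip command above.

```python
import importlib.util


def cuda_tag(torch_version: str) -> str:
    """Return the local build tag of a PyTorch version string, e.g. 'cu118'."""
    # '2.0.1+cu118' -> 'cu118'; a version without '+' has no build tag.
    _, _, tag = torch_version.partition("+")
    return tag


def check_torch(expected_tag: str = "cu118") -> str:
    """Describe whether the installed PyTorch build matches the expected CUDA tag."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the script still runs without torch

    tag = cuda_tag(torch.__version__)
    if tag != expected_tag:
        return f"torch {torch.__version__} found, expected a +{expected_tag} build"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} found, but no CUDA device is visible"
    return f"torch {torch.__version__} with CUDA {torch.version.cuda} is ready"


if __name__ == "__main__":
    print(check_torch())
```

Running the script inside the `headstudio` environment prints a one-line summary; any message other than the "is ready" one suggests revisiting the PyTorch install step.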

HeadStudio is built on FLAME. Before you continue, please register and agree to the license at https://flame.is.tue.mpg.de.
Download FLAME 2020, which contains FLAME_FEMALE.pkl, FLAME_GENERIC.pkl, and FLAME_MALE.pkl, from https://flame.is.tue.mpg.de.
Download the other checkpoints and the training/validation files from here.
Arrange the folders like this:

.
|-ckpts
    |-ControlNet-Mediapipe
        |-flame2facemsh.npy
        |-mediapipe_landmark_embedding.npz
    |-FLAME-2000
        |-FLAME_FEMALE.pkl
        |-FLAME_GENERIC.pkl
        |-FLAME_MALE.pkl
        |-flame_static_embeddings.pkl
        |-flame_dynamic_embeddings.pkl
|-talkshow
    # for training with animation
    |-collection
        |-cemistry_exp.npy
    # for evaluation
    |-ExpressiveWholeBodyDatasetReleaseV1.0
...
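
A misplaced file in this layout usually only shows up as an error partway through training, so a quick check up front can save time. The sketch below is an illustration under stated assumptions: `missing_files` and `EXPECTED` are hypothetical names, the paths are a subset copied from the layout above, and the script is assumed to run from the repository root.

```python
from pathlib import Path

# Subset of the layout listed above; extend it with the remaining entries you need.
EXPECTED = [
    "ckpts/ControlNet-Mediapipe/flame2facemsh.npy",
    "ckpts/ControlNet-Mediapipe/mediapipe_landmark_embedding.npz",
    "ckpts/FLAME-2000/FLAME_GENERIC.pkl",
    "ckpts/FLAME-2000/flame_static_embeddings.pkl",
    "ckpts/FLAME-2000/flame_dynamic_embeddings.pkl",
    "talkshow/collection/cemistry_exp.npy",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All listed files are in place.")
```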

Specify the talkshow_train_path and talkshow_val_path in ./configs/headstudio.yaml.

Usage

python3 launch.py \
--config configs/headstudio.yaml --train system.prompt_processor.prompt='a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, ultra-HD, next generation' \
system.guidance.use_nfsd=True system.max_grad=0.001 system.area_relax=True

More examples can be found in ./scripts/headstudio.sh
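
When trying several prompts, it may be convenient to build the command above programmatically rather than editing it by hand. This is a minimal sketch that reuses only the flags shown in the command above; `build_train_command` and `FIXED_OVERRIDES` are hypothetical names, and the script only prints the commands unless you pass them to `subprocess.run` yourself.

```python
import shlex

# Overrides copied from the example command above.
FIXED_OVERRIDES = [
    "system.guidance.use_nfsd=True",
    "system.max_grad=0.001",
    "system.area_relax=True",
]


def build_train_command(prompt, config="configs/headstudio.yaml"):
    """Build the argument list for one training run with the given text prompt."""
    return [
        "python3", "launch.py",
        "--config", config,
        "--train",
        f"system.prompt_processor.prompt={prompt}",
        *FIXED_OVERRIDES,
    ]


if __name__ == "__main__":
    prompts = [
        "a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, ultra-HD, next generation",
    ]
    for prompt in prompts:
        cmd = build_train_command(prompt)
        # Print a copy-pasteable shell line; to launch it instead,
        # call subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or commas.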

Prepare Animation Data

Install TalkSHOW. A separate Python environment is recommended for the animation steps below, since TalkSHOW requires Python 3.7.

Remember to install torchaudio~=0.13.1 and torchvision~=0.14.1.

Download SHOW_dataset_v1.0.zip following this.

Animation

Video-based Animation

Animate the avatar using a .pkl file captured from a video clip (SHOW_dataset_v1.0.zip).

python3 animation.py

Audio-based Animation

Copy ./scripts/demo.py into the TalkSHOW folder.
Specify save_root in demo.py.
Given an audio clip, generate FLAME sequences with TalkSHOW as shown below, replacing path-to-wav-file with the path to your audio file.

cd TalkSHOW
python3 demo.py \
--config_file ./config/body_pixel.json --infer --audio_file path-to-wav-file \
--id 0 --only_face

Animate the avatar using the FLAME sequences generated by TalkSHOW.

python3 animation_TalkSHOW.py --audio path-to-audio --avatar path-to-avatar
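
The two audio-based steps run from different folders and, if TalkSHOW lives in its own Python 3.7 environment, with different interpreters. The sketch below chains them under several assumptions: the function names are hypothetical, the TalkSHOW folder is assumed to sit at `./TalkSHOW`, the same audio file is assumed to be passed to both steps, and `talkshow_python` is a placeholder for the interpreter of the TalkSHOW environment. It defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path


def talkshow_command(wav_path, python="python3"):
    """Arguments for the TalkSHOW step; run it from inside the TalkSHOW folder."""
    return [
        python, "demo.py",
        "--config_file", "./config/body_pixel.json",
        "--infer",
        "--audio_file", str(wav_path),
        "--id", "0",
        "--only_face",
    ]


def animation_command(audio_path, avatar_path):
    """Arguments for the HeadStudio step; run it from the HeadStudio folder."""
    return [
        "python3", "animation_TalkSHOW.py",
        "--audio", str(audio_path),
        "--avatar", str(avatar_path),
    ]


def run_audio_animation(wav_path, avatar_path, talkshow_dir="TalkSHOW",
                        talkshow_python="python3", dry_run=True):
    """Run (or just print) both steps in order and return them as (cmd, cwd) pairs."""
    # Use an absolute path because the two steps run in different folders.
    wav_path = Path(wav_path).resolve()
    steps = [
        (talkshow_command(wav_path, talkshow_python), Path(talkshow_dir)),
        (animation_command(wav_path, avatar_path), Path(".")),
    ]
    for cmd, cwd in steps:
        if dry_run:
            print(f"[{cwd}] {' '.join(cmd)}")
        else:
            subprocess.run(cmd, cwd=cwd, check=True)
    return steps


if __name__ == "__main__":
    run_audio_animation("path-to-wav-file", "path-to-avatar")
```

Set `dry_run=False` only after checking that the printed commands match your setup.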

Text-based Animation

Generate audio from the given text using PlayHT.
Then follow the audio-based animation steps above.

Acknowledgements

HeadStudio is developed by ReLER at Zhejiang University. All rights reserved.
Thanks to Duochao and Xuancheng for fixing bugs and further developing this work.
Thanks to PlayHT, which we use for text-to-audio generation.
Thanks to TalkSHOW, which we use for audio-driven avatar animation.
Thanks to threestudio, GaussianAvatars, HumanGaussian, and TADA; this work builds on these amazing research projects.

Notes

If you have questions or find bugs, feel free to open an issue or email the first author (zhenglinzhou@zju.edu.cn).
If you encounter "RuntimeError: an illegal memory access was encountered" or "numel: integer multiplication overflow" errors during rasterization, try reinstalling diff-gaussian-rasterization with the -fno-gnu-unique flag. For more details, look here.

Cite

If you find HeadStudio useful for your research and applications, please cite us using this BibTeX:

@inproceedings{zhou2024headstudio,
  title = {HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting},
  author = {Zhenglin Zhou and Fan Ma and Hehe Fan and Zongxin Yang and Yi Yang},
  booktitle = {ECCV},
  year = {2024},
}
