HeadStudio

[ECCV 2024] HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.

<div align="center"> <h1>HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting</h1>

Zhenglin Zhou · Fan Ma · Hehe Fan · Zongxin Yang · Yi Yang<sup>*</sup>

ReLER, CCAI, Zhejiang University

<sup>*</sup>corresponding author

<a href='https://zhenglinzhou.github.io/HeadStudio-ProjectPage/'><img src='https://img.shields.io/badge/Project-Page-green'></a> <a href='https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/04681.pdf'><img src='https://img.shields.io/badge/Technique-Report-red'></a>

https://github.com/ZhenglinZhou/HeadStudio/assets/42434623/19893d52-8fe5-473d-b5c0-aea29d6be21a

</div>

Text to Head Avatars Generation

<p align="center"> <img src="./assets/teaser.png"> </p>

Text-based animatable avatar generation by HeadStudio.

Installation

All of the following steps have been tested successfully with CUDA 11.8.

# clone the github repo
git clone https://github.com/zhenglinzhou/HeadStudio-open.git
cd HeadStudio-open

Create a conda environment:

# make a new conda env (optional)
conda create -n headstudio python=3.9
conda activate headstudio

It may take some time to install:

# install necessary packages
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# install some packages using conda
bash packages.sh

# install packages using pip
pip install -r requirements.txt

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
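
After installation, a quick sanity check can confirm that the installed PyTorch build matches the CUDA 11.8 wheel pinned above. The following is a minimal sketch, not part of HeadStudio: the helper name `matches_pinned_build` is hypothetical, and the expected values are simply copied from the pip command above.

```python
def matches_pinned_build(version, expected_release="2.0.1", expected_cuda="cu118"):
    """Return True if a torch version string such as '2.0.1+cu118' matches the pinned build."""
    # Split '2.0.1+cu118' into the release part and the local (CUDA) tag.
    release, _, local = str(version).partition("+")
    return release == expected_release and local == expected_cuda


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print("matches pinned build:", matches_pinned_build(torch.__version__))
        print("CUDA available:", torch.cuda.is_available())
```

If the check reports a mismatch, reinstalling with the pip command above inside the headstudio environment is the first thing to try.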
  • HeadStudio is built on FLAME. Before continuing, please register at https://flame.is.tue.mpg.de and agree to the license.
  • Download FLAME 2020, which contains FLAME_FEMALE.pkl, FLAME_GENERIC.pkl, and FLAME_MAKE.pkl, from https://flame.is.tue.mpg.de.
  • Download the other checkpoints and the training/validation files from here.
  • Arrange the folders as follows:
.
|-ckpts
    |-ControlNet-Mediapipe
        |-flame2facemsh.npy
        |-mediapipe_landmark_embedding.npz
    |-FLAME-2000
        |-FLAME_FEMALE.pkl
        |-FLAME_GENERIC.pkl
        |-FLAME_MAKE.pkl
        |-flame_static_embeddings.pkl
        |-flame_dynamic_embeddings.pkl
|-talkshow
    # for training with animation
    |-collection
        |-cemistry_exp.npy
    # for evaluation
    |-ExpressiveWholeBodyDatasetReleaseV1.0
...
  • Specify the talkshow_train_path and talkshow_val_path in ./configs/headstudio.yaml.
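
Before launching training, it can help to verify that the files are where the folder layout above says they should be. The following is a minimal sketch under that assumption: the file and folder names are copied verbatim from the layout, and the helper `find_missing` is hypothetical rather than something HeadStudio provides.

```python
from pathlib import Path

# Relative paths copied verbatim from the folder layout above.
EXPECTED_FILES = [
    "ckpts/ControlNet-Mediapipe/flame2facemsh.npy",
    "ckpts/ControlNet-Mediapipe/mediapipe_landmark_embedding.npz",
    "ckpts/FLAME-2000/FLAME_FEMALE.pkl",
    "ckpts/FLAME-2000/FLAME_GENERIC.pkl",
    "ckpts/FLAME-2000/FLAME_MAKE.pkl",
    "ckpts/FLAME-2000/flame_static_embeddings.pkl",
    "ckpts/FLAME-2000/flame_dynamic_embeddings.pkl",
    "talkshow/collection/cemistry_exp.npy",
]
EXPECTED_DIRS = ["talkshow/ExpressiveWholeBodyDatasetReleaseV1.0"]


def find_missing(root):
    """Return the expected files and folders that do not exist under root."""
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for path in find_missing("."):
        print("missing:", path)
```

An empty result means every listed path was found; it does not check that talkshow_train_path and talkshow_val_path in the config point to them.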

Usage

python3 launch.py \
--config configs/headstudio.yaml --train system.prompt_processor.prompt='a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, ultra-HD, next generation' \
system.guidance.use_nfsd=True system.max_grad=0.001 system.area_relax=True

More examples can be found in ./scripts/headstudio.sh
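
To generate several avatars in a row, the training command above can also be assembled programmatically. The following is a minimal sketch, assuming it is run from the repository root: `build_launch_command` and `launch` are hypothetical helpers, the overrides are copied from the command above, and `sys.executable` stands in for `python3`.

```python
import shlex
import subprocess
import sys


def build_launch_command(prompt, config="configs/headstudio.yaml"):
    """Assemble the argument list for the training command shown above."""
    return [
        sys.executable, "launch.py",
        "--config", config,
        "--train",
        # Passed as one list element, so the prompt needs no shell quoting.
        f"system.prompt_processor.prompt={prompt}",
        "system.guidance.use_nfsd=True",
        "system.max_grad=0.001",
        "system.area_relax=True",
    ]


def launch(prompt, dry_run=True):
    """Print the command, or run it when dry_run is False."""
    cmd = build_launch_command(prompt)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    prompts = [
        "a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, ultra-HD, next generation",
    ]
    for prompt in prompts:
        launch(prompt, dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to review them before committing GPU time.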

Prepare Animation Data

  1. Install TalkSHOW. Use a separate Python environment for the animation steps below, since TalkSHOW requires Python 3.7. Remember to install torchaudio~=0.13.1 and torchvision~=0.14.1 in that environment.

  2. Download SHOW_dataset_v1.0.zip following this.

Animation

Video-based Animation

Animate the avatar using a .pkl file captured from a video clip (from SHOW_dataset_v1.0.zip).

python3 animation.py

Audio-based Animation

  • Copy ./scripts/demo.py into the TalkSHOW folder.
  • Specify the save_root in demo.py.
  • Given an audio clip, generate FLAME sequences via TalkSHOW as shown below, replacing path-to-wav-file with the path to your audio file:
cd TalkSHOW
python3 demo.py \
--config_file ./config/body_pixel.json --infer --audio_file path-to-wav-file \
--id 0 --only_face
  • Animate the avatar using the FLAME sequences generated by TalkSHOW:
python3 animation_TalkSHOW.py --audio path-to-audio --avatar path-to-avatar
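
The two commands above run in different folders, which is easy to get wrong when scripting the pipeline. The following is a minimal sketch that only builds and prints both commands. The helper names are hypothetical, the paths are the same placeholders used above, and it is an assumption that animation_TalkSHOW.py is run from the HeadStudio folder rather than from TalkSHOW.

```python
import shlex


def talkshow_command(wav_path):
    """Step 1: run inside the TalkSHOW folder to generate FLAME sequences."""
    return [
        "python3", "demo.py",
        "--config_file", "./config/body_pixel.json",
        "--infer",
        "--audio_file", str(wav_path),
        "--id", "0",
        "--only_face",
    ]


def animation_command(audio_path, avatar_path):
    """Step 2: animate the avatar (assumed to run inside the HeadStudio folder)."""
    return [
        "python3", "animation_TalkSHOW.py",
        "--audio", str(audio_path),
        "--avatar", str(avatar_path),
    ]


if __name__ == "__main__":
    # Placeholder paths, as in the commands above.
    print("(cd TalkSHOW && " + shlex.join(talkshow_command("path-to-wav-file")) + ")")
    print(shlex.join(animation_command("path-to-audio", "path-to-avatar")))
```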

Text-based Animation

  • Generate audio from the given text using PlayHT.
  • Then follow the audio-based animation steps above.

Acknowledgements

  • HeadStudio is developed by ReLER at Zhejiang University; all rights reserved.
  • Thanks to Duochao and Xuancheng for fixing bugs and further developing this work.
  • Thanks to PlayHT, which we use for text-to-audio generation.
  • Thanks to TalkSHOW, which we use for audio-driven avatar animation.
  • Thanks to threestudio, GaussianAvatars, HumanGaussian, and TADA; this work builds on these amazing research works.

Notes

  • If you have questions or find bugs, feel free to open an issue or email the first author (zhenglinzhou@zju.edu.cn)!
  • If you encounter RuntimeError: an illegal memory access was encountered or numel: integer multiplication overflow errors during rasterization, try reinstalling diff-gaussian-rasterization with the -fno-gnu-unique flag. For more details, look here.

Cite

If you find HeadStudio useful for your research and applications, please cite us using this BibTeX:

@inproceedings{zhou2024headstudio,
  title = {HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting},
  author = {Zhenglin Zhou and Fan Ma and Hehe Fan and Zongxin Yang and Yi Yang},
  booktitle = {ECCV},
  year={2024},
}