
SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Install / Use

/learn @OpenTalker/SadTalker

README

<div align="center"> <img src='https://user-images.githubusercontent.com/4397546/229094115-862c747e-7397-4b54-ba4a-bd368bfe2e0f.png' width='500px'/> <!--<h2> 😭 SadTalker: <span style="font-size:12px">Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation </span> </h2> -->

<a href='https://arxiv.org/abs/2211.12194'><img src='https://img.shields.io/badge/ArXiv-PDF-red'></a>   <a href='https://sadtalker.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>   Open In Colab   Hugging Face Spaces   sd webui-colab   <br> Replicate Discord

<div> <a target='_blank'>Wenxuan Zhang <sup>*,1,2</sup> </a>&emsp; <a href='https://vinthony.github.io/' target='_blank'>Xiaodong Cun <sup>*,2</a>&emsp; <a href='https://xuanwangvc.github.io/' target='_blank'>Xuan Wang <sup>3</sup></a>&emsp; <a href='https://yzhang2016.github.io/' target='_blank'>Yong Zhang <sup>2</sup></a>&emsp; <a href='https://xishen0220.github.io/' target='_blank'>Xi Shen <sup>2</sup></a>&emsp; </br> <a href='https://yuguo-xjtu.github.io/' target='_blank'>Yu Guo<sup>1</sup> </a>&emsp; <a href='https://scholar.google.com/citations?hl=zh-CN&user=4oXBp9UAAAAJ' target='_blank'>Ying Shan <sup>2</sup> </a>&emsp; <a target='_blank'>Fei Wang <sup>1</sup> </a>&emsp; </div> <br> <div> <sup>1</sup> Xi'an Jiaotong University &emsp; <sup>2</sup> Tencent AI Lab &emsp; <sup>3</sup> Ant Group &emsp; </div> <br> <i><strong><a href='https://arxiv.org/abs/2211.12194' target='_blank'>CVPR 2023</a></strong></i> <br> <br>

sadtalker

<b>TL;DR:       single portrait image 🙎‍♂️      +       audio 🎤       =       talking head video 🎞.</b>

<br> </div>

Highlights

  • The license has been updated to Apache 2.0, and we've removed the non-commercial restriction

  • SadTalker has now officially been integrated into Discord, where you can use it for free by sending files. You can also generate high-quality videos from text prompts. Join: Discord

  • We've published a stable-diffusion-webui extension. Check out more details here. Demo Video

  • Full image mode is now available! More details...

| still + enhancer in v0.0.1 | still + enhancer in v0.0.2 | input image @bagbag1815 |
|:--------------------:|:--------------------:|:----:|
| <video src="https://user-images.githubusercontent.com/48216707/229484996-5d7be64f-2553-4c9e-a452-c5cf0b8ebafe.mp4" type="video/mp4"> </video> | <video src="https://user-images.githubusercontent.com/4397546/230717873-355b7bf3-d3de-49f9-a439-9220e623fce7.mp4" type="video/mp4"> </video> | <img src='./examples/source_image/full_body_2.png' width='380'> |

  • Several new modes (Still, reference, and resize modes) are now available!

  • We're happy to see more community demos on bilibili, YouTube and X (#sadtalker).

Changelog

The previous changelog can be found here.

  • [2023.06.12]: Added more new features in WebUI extension, see the discussion here.

  • [2023.06.05]: Released a new 512x512px (beta) face model. Fixed some bugs and improved performance.

  • [2023.04.15]: Added a WebUI Colab notebook by @camenduru: sd webui-colab

  • [2023.04.12]: Added a more detailed WebUI installation document and fixed a problem when reinstalling.

  • [2023.04.12]: Fixed WebUI security issues caused by third-party packages, and optimized the output path in sd-webui-extension.

  • [2023.04.08]: In v0.0.2, we added a logo watermark to the generated video to prevent abuse. This watermark has since been removed in a later release.

  • [2023.04.08]: In v0.0.2, we added features for full image animation and a link to download checkpoints from Baidu. We also optimized the enhancer logic.

To-Do

We're tracking new updates in issue #280.

Troubleshooting

If you have any problems, please read our FAQs before opening an issue.

1. Installation.

Community tutorials: 中文Windows教程 (Chinese Windows tutorial) | 日本語コース (Japanese tutorial).

Linux/Unix

  1. Install Anaconda, Python and git.

  2. Create the environment and install the requirements:

git clone https://github.com/OpenTalker/SadTalker.git

cd SadTalker 

conda create -n sadtalker python=3.8

conda activate sadtalker

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

conda install ffmpeg

pip install -r requirements.txt

### Coqui TTS is optional for gradio demo. 
### pip install TTS
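Once the environment is set up and the checkpoints are downloaded (see the Download Models section below), a typical run looks like the following. This is a sketch, not the canonical invocation: the flags shown (`--driven_audio`, `--source_image`, `--result_dir`, `--enhancer`) reflect the repository's `inference.py` at the time of writing; run `python inference.py --help` to confirm them for your version.

```shell
# Animate a single portrait image with a driving audio clip.
# The example paths are illustrative; substitute your own files.
python inference.py \
  --driven_audio examples/driven_audio/bus_chinese.wav \
  --source_image examples/source_image/full_body_2.png \
  --result_dir ./results \
  --enhancer gfpgan      # optional: GFPGAN face enhancement
```

The resulting talking-head video is written under `--result_dir`.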

Windows

A video tutorial in Chinese is available here. Alternatively, follow these instructions:

  1. Install Python 3.8 and check "Add Python to PATH".
  2. Install git manually or using Scoop: scoop install git.
  3. Install ffmpeg, following this tutorial or using scoop: scoop install ffmpeg.
  4. Download the SadTalker repository by running git clone https://github.com/Winfredy/SadTalker.git.
  5. Download the checkpoints and gfpgan models in the downloads section.
  6. Run start.bat from Windows Explorer as a normal, non-administrator user, and a Gradio-powered WebUI demo will start.

macOS

A tutorial on installing SadTalker on macOS can be found here.

Docker, WSL, etc

Please check out additional tutorials here.

2. Download Models

You can run the following script on Linux/macOS to automatically download all the models:

bash scripts/download_models.sh
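After the script finishes, you can sanity-check that the expected files landed in place before running inference. The sketch below assumes the new-version checkpoint names from the model table in this README; adjust the list to match your release.

```python
from pathlib import Path

# Checkpoint files the download script is expected to fetch (new-version layout,
# taken from the model table below). Adjust for your SadTalker version.
EXPECTED = [
    "checkpoints/mapping_00229-model.pth.tar",
    "checkpoints/mapping_00109-model.pth.tar",
    "checkpoints/SadTalker_V0.0.2_256.safetensors",
    "checkpoints/SadTalker_V0.0.2_512.safetensors",
]

def missing_checkpoints(root: str, expected=EXPECTED) -> list[str]:
    """Return the expected checkpoint paths that are absent under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]

if __name__ == "__main__":
    gaps = missing_checkpoints(".")
    if gaps:
        print("Missing files:", *gaps, sep="\n  ")
    else:
        print("All checkpoints present.")
```

Run it from the repository root; an empty result means the download completed.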

We also provide an offline patch (gfpgan/), so no models need to be downloaded at generation time.

Pre-Trained Models

<!-- TODO add Hugging Face links -->

GFPGAN Offline Patch

<!-- TODO add Hugging Face links --> <details><summary>Model Details</summary>

Model descriptions:

New version

| Model | Description |
| :--- | :---------- |
| checkpoints/mapping_00229-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/mapping_00109-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/SadTalker_V0.0.2_256.safetensors | Packaged SadTalker checkpoints (old version, 256 face render). |
| checkpoints/SadTalker_V0.0.2_512.safetensors | Packaged SadTalker checkpoints (old version, 512 face render). |
| gfpgan/weights | Face detection and enhancement models used in facexlib and gfpgan. |

Old version

| Model | Description |
| :--- | :---------- |
| checkpoints/auido2exp_00300-model.pth | Pre-trained ExpNet in SadTalker. |
| checkpoints/auido2pose_00140-model.pth | Pre-trained PoseVAE in SadTalker. |
| checkpoints/mapping_00229-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/mapping_00109-model.pth.tar | Pre-trained MappingNet in SadTalker. |
| checkpoints/facevid2vid_00189-model.pth.tar | Pre-trained face-vid2vid model from the reappearance of face-vid2vid. |
| checkpoints/epoch_20.pth | Pre-trained 3DMM extractor in Deep3DFaceReconstruction. |
| checkpoints/wav2lip.pth | Highly accurate lip-sync model in Wav2Lip. |
| checkpoints/shape_predictor_68_face_landmarks.dat | Face landmark model. |
