AniCrafter
AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models
<a href='https://arxiv.org/abs/2505.20255'><img src='https://img.shields.io/badge/ArXiv-PDF-red'></a> <a href='https://myniuuu.github.io/AniCrafter'><img src='https://img.shields.io/badge/Project-Page-Green'></a> <a href='https://huggingface.co/MyNiuuu/Anicrafter_release'><img src='https://img.shields.io/badge/🤗 HuggingFace-AniCrafter-yellow'></a>
<div align="center"> <h3> <img src="assets/caixukun_1080p.gif"/> </h3> </div>

TL;DR
<div align="center"> <h3> <img src="assets/teaser_small.jpg"/> </h3> </div>

We leverage a "3DGS Avatar + Background Video" representation as guidance for the video diffusion model to insert and animate anyone into any scene following a given motion sequence.
🔥🔥🔥 New Features/Updates
- (2025.07.22) We support Unified Sequence Parallel (USP) for multi-GPU inference.
- (2025.07.03) We have released the cross-character inference script to replace the person in the source video!
- (2025.07.02) Our Project Page 🏠 is online!
- (2025.07.01) We have released the model and inference script to insert and animate the character into the background video following SMPLX motion sequences!
- If you find this work interesting, please do not hesitate to give a ⭐!
📰 CODE RELEASE
- [x] (2025.07.01) Release model checkpoint and cross-character inference script.
- [x] (2025.07.03) Release the complete cross-character inference script including data preprocessing (mask parsing + SMPLX estimation + background inpainting).
- [x] (2025.07.22) We support Unified Sequence Parallel (USP) for multi-GPU inference.
- [ ] Release training codes.
⚙️ Environment Setup
🌍 Virtual Environment
conda create -n anicrafter python=3.10
conda activate anicrafter
bash install_cu124.sh
pip install xfuser # Unified Sequence Parallel (USP) for multi-GPU inference
📦 Download Checkpoints
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./Wan2.1-I2V-14B-720P
huggingface-cli download MyNiuuu/Anicrafter_release --local-dir ./Anicrafter_release
mv ./Anicrafter_release/gfpgan ./gfpgan
mv ./Anicrafter_release/pretrained_models ./pretrained_models
🏃 Cross-Character Inference from Background Video and Motions
Run the following commands to insert and animate the character into the background video following SMPLX motion sequences. The pipeline consists of the following key steps:
- Reconstructing the 3DGS Avatar from a single image using LHM
- Animating the 3DGS Avatar according to the SMPLX sequences to obtain spatially aligned avatar renderings
- Combining the avatar renderings and the background video to form the "Avatar + Background" condition
- Running the diffusion model to obtain the final animation results
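The "Avatar + Background" conditioning step above can be sketched as a per-frame alpha composite of the avatar rendering over the background frame. This is a minimal illustration only; `composite_condition` and the toy frames below are hypothetical and not part of the repository's API:

```python
import numpy as np

def composite_condition(avatar_rgb, avatar_alpha, background_rgb):
    """Blend one avatar rendering over one background frame.

    avatar_rgb / background_rgb: (H, W, 3) float arrays in [0, 1]
    avatar_alpha: (H, W, 1) float opacity map from the avatar renderer
    """
    return avatar_alpha * avatar_rgb + (1.0 - avatar_alpha) * background_rgb

# Toy frame: a white avatar covering the left half of a black background.
h, w = 4, 4
avatar = np.ones((h, w, 3))        # white avatar rendering
alpha = np.zeros((h, w, 1))
alpha[:, : w // 2] = 1.0           # avatar occupies the left half
background = np.zeros((h, w, 3))   # black background frame

condition = composite_condition(avatar, alpha, background)
```

Running this per frame yields the condition video that, together with the character image, drives the diffusion model.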
Single GPU Inference
python run_pipeline.py \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--scene_path ./demo/videos/scene_000000 \
--save_root ./infer_result
Multi-GPU Inference with Unified Sequence Parallel (USP) for acceleration
torchrun \
--nproc_per_node=8 \
--master_port=16333 \
run_pipeline_usp.py \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--scene_path ./demo/videos/scene_000000 \
--save_root ./infer_result
🏃 Cross-Character Inference from In-the-Wild Videos
Run the following commands to replace the person in a source video using our complete data preprocessing pipeline, which consists of the following components:
- Parsing human masks
- Estimating SMPLX parameters and rendering SMPLX mesh videos
- Inpainting the background based on the human masks
- Reconstructing the 3DGS Avatar from a single image using LHM
- Animating the 3DGS Avatar according to the SMPLX sequences to obtain spatially aligned avatar renderings
- Combining the avatar renderings and the background video to form the "Avatar + Background" condition
- Running the diffusion model to obtain the final animation results
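Before inpainting, it is common practice to grow the parsed human mask slightly so the inpainter also covers silhouette edges and shadows. The sketch below shows this idea with a plain NumPy dilation; `dilate_mask` is a hypothetical helper for illustration, and the actual pipeline relies on its own mask-parsing models and the ProPainter weights downloaded below:

```python
import numpy as np

def dilate_mask(mask, radius=1):
    """Grow a binary human mask by `radius` pixels (square structuring element)."""
    padded = np.pad(mask, radius)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Take the max over each shifted window to expand the mask.
            out = np.maximum(
                out, padded[radius + dy : radius + dy + h, radius + dx : radius + dx + w]
            )
    return out

# A single foreground pixel grows into a 3x3 block with radius=1.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[2, 2] = 1
dilated = dilate_mask(mask, radius=1)
```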
⚙️ Additional Environment Setup
cd engine/pose_estimation
pip install mmcv==1.3.9
pip install -v -e third-party/ViTPose
pip install ultralytics
pip install av
cd ../..
pip install numpy==1.23.5
mkdir weights
cd weights
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/cutie-base-mega.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/i3d_rgb_imagenet.pt
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/ProPainter.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/raft-things.pth
wget https://github.com/sczhou/ProPainter/releases/download/v0.1.0/recurrent_flow_completion.pth
cd ..
# or you can manually download the weights from https://github.com/sczhou/ProPainter/releases/tag/v0.1.0
Single GPU Inference
# Mask + SMPLX + Inpainting + Avatar Recon + Rendering + Diffusion
# You can adjust the hyperparameters of the inpainting algorithm for better results
python run_pipeline_with_preprocess.py \
--video_root ./demo/origin_videos/raw_video \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--save_root ./infer_result_replace
Multi-GPU Inference with Unified Sequence Parallel (USP) for acceleration
# Mask + SMPLX + Inpainting + Avatar Recon + Rendering + Diffusion
# You can adjust the hyperparameters of the inpainting algorithm for better results
torchrun \
--nproc_per_node=8 \
--master_port=16333 \
run_pipeline_with_preprocess_usp.py \
--video_root ./demo/origin_videos/raw_video \
--ckpt_path ./pretrained_models/anicrafter \
--wan_base_ckpt_path ./Wan2.1-I2V-14B-720P \
--character_image_path ./demo/character_images/000000.jpg \
--save_root ./infer_result_replace
Citation
@article{niu2025anicrafter,
title={AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models},
author={Niu, Muyao and Cao, Mingdeng and Zhan, Yifan and Zhu, Qingtian and Ma, Mingze and Zhao, Jiancheng and Zeng, Yanhong and Zhong, Zhihang and Sun, Xiao and Zheng, Yinqiang},
journal={arXiv preprint arXiv:2505.20255},
year={2025}
}
Acknowledgements
We sincerely appreciate the code releases of the following projects: LHM, UniAnimate-DiT, Diffusers, and DiffSynth-Studio.
