HunyuanVideo: A Systematic Framework For Large Video Generation Model
<div align="center"> <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo Code&message=Github&color=blue"></a>   <a href="https://aivideo.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Web&color=green"></a>   <a href="https://video.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Playground&message=Web&color=green"></a> </div> <div align="center"> <a href="https://arxiv.org/abs/2412.03603"><img src="https://img.shields.io/static/v1?label=Tech Report&message=Arxiv&color=red"></a>   <a href="https://aivideo.hunyuan.tencent.com/hunyuanvideo.pdf"><img src="https://img.shields.io/static/v1?label=Tech Report&message=High-Quality Version (~350M)&color=red"></a> </div> <div align="center"> <a href="https://huggingface.co/tencent/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo&message=HuggingFace&color=yellow"></a>   <a href="https://huggingface.co/docs/diffusers/main/api/pipelines/hunyuan_video"><img src="https://img.shields.io/static/v1?label=HunyuanVideo&message=Diffusers&color=yellow"></a>   <a href="https://huggingface.co/tencent/HunyuanVideo-PromptRewrite"><img src="https://img.shields.io/static/v1?label=HunyuanVideo-PromptRewrite&message=HuggingFace&color=yellow"></a> </div> <p align="center"> 👋 Join our <a href="assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/tv7FkG4Nwf" target="_blank">Discord</a> </p> <p align="center">This repo contains PyTorch model definitions, pre-trained weights and inference/sampling code for our paper exploring HunyuanVideo. You can find more visualizations on our project page.
🔥🔥🔥 News!!
- Nov 21, 2025: 🎉 We release HunyuanVideo-1.5, a highly efficient and powerful new foundation model.
- May 28, 2025: 💃 We release HunyuanVideo-Avatar, an audio-driven human animation model based on HunyuanVideo.
- May 09, 2025: 🙆 We release HunyuanCustom, a multimodal-driven architecture for customized video generation based on HunyuanVideo.
- Mar 06, 2025: 🌅 We release HunyuanVideo-I2V, an image-to-video model based on HunyuanVideo.
- Jan 13, 2025: 📈 We release the Penguin Video Benchmark.
- Dec 18, 2024: 🏃‍♂️ We release the FP8 model weights of HunyuanVideo to save GPU memory.
- Dec 17, 2024: 🤗 HunyuanVideo has been integrated into Diffusers.
- Dec 07, 2024: 🚀 We release the parallel inference code for HunyuanVideo powered by xDiT.
- Dec 03, 2024: 👋 We release the inference code and model weights of HunyuanVideo. Download.
🎥 Demo
<div align="center"> <video src="https://github.com/user-attachments/assets/22440764-0d7e-438e-a44d-d0dad1006d3d" width="70%" poster="./assets/video_poster.png"> </video> </div>

🧩 Community Contributions
If you develop/use HunyuanVideo in your projects, welcome to let us know.
- ComfyUI-Kijai (FP8 Inference, V2V and IP2V Generation): ComfyUI-HunyuanVideoWrapper by Kijai
- ComfyUI-Native (Native Support): ComfyUI-HunyuanVideo by ComfyUI Official
- FastVideo (Consistency Distilled Model and Sliding Tile Attention): FastVideo and Sliding Tile Attention by Hao AI Lab
- HunyuanVideo-gguf (GGUF Version and Quantization): HunyuanVideo-gguf by city96
- Enhance-A-Video (Better Generated Video for Free): Enhance-A-Video by NUS-HPC-AI-Lab
- HunyuanVideoGP (GPU-Poor Version): HunyuanVideoGP by DeepBeepMeep
- RIFLEx (Video Length Extrapolation): RIFLEx by Tsinghua University
- HunyuanVideo Keyframe Control LoRA: hunyuan-video-keyframe-control-lora by dashtoon
- Sparse-VideoGen (Accelerate Video Generation with High Pixel-level Fidelity): Sparse-VideoGen by University of California, Berkeley
- FramePack (Packing Input Frame Context in Next-Frame Prediction Models for Video Generation): FramePack by Lvmin Zhang
- Jenga (Training-Free Efficient Video Generation via Dynamic Token Carving): Jenga by DV Lab
- DCM (Dual-Expert Consistency Model for Efficient and High-Quality Video Generation): DCM by Vchitect
📑 Open-source Plan
- HunyuanVideo (Text-to-Video Model)
- [x] Inference
- [x] Checkpoints
- [x] Multi-GPU Sequence Parallel Inference (faster inference on more GPUs)
- [x] Web Demo (Gradio)
- [x] Diffusers
- [x] FP8 Quantized Weights
- [x] Penguin Video Benchmark
- [x] ComfyUI
- HunyuanVideo (Image-to-Video Model)
- [x] Inference
- [x] Checkpoints
Contents
- HunyuanVideo: A Systematic Framework For Large Video Generation Model
- 🎥 Demo
- 🔥🔥🔥 News!!
- 🧩 Community Contributions
- 📑 Open-source Plan
- Contents
- Abstract
- HunyuanVideo Overall Architecture
- 🎉 HunyuanVideo Key Features
- 📈 Comparisons
- 📜 Requirements
- 🛠️ Dependencies and Installation
- 🧱 Download Pretrained Models
- 🔑 Single-gpu Inference
- 🚀 Parallel Inference on Multiple GPUs by xDiT
- 🚀 FP8 Inference
- 🔗 BibTeX
- Acknowledgements
- Star History
Abstract
We present HunyuanVideo, a novel open-source video foundation model whose video generation performance is comparable to, if not superior to, that of leading closed-source models. To train HunyuanVideo, we adopt several key technologies, including data curation, joint image-video model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling the model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models.
We conducted extensive experiments and implemented a series of targeted designs to ensure high visual quality, motion diversity, text-video alignment, and generation stability. According to professional human evaluation results, HunyuanVideo outperforms previous state-of-the-art models, including Runway Gen-3, Luma 1.6, and 3 top-performing Chinese video generative models. By releasing the code and weights of the foundation model and its applications, we aim to bridge the gap between closed-source and open-source video foundation models. This initiative will empower everyone in the community to experiment with their ideas, fostering a more dynamic and vibrant video generation ecosystem.
HunyuanVideo Overall Architecture
HunyuanVideo is trained on a spatially-temporally compressed latent space, which is compressed through a Causal 3D VAE.
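As a rough illustration of this latent compression, the sketch below computes the latent tensor shape for a given clip, assuming the compression ratios reported in the tech report (4× temporal, 8× spatial, 16 latent channels). The `latent_shape` helper is illustrative only and is not part of this repository.

```python
def latent_shape(frames: int, height: int, width: int,
                 ct: int = 4, cs: int = 8, channels: int = 16):
    """Rough latent-tensor shape under causal 3D VAE compression.

    A causal video VAE keeps the first frame and compresses the remaining
    frames by a factor of ct temporally; spatial dims shrink by cs; the
    latent has `channels` channels. Ratios assume the tech-report values.
    """
    t = (frames - 1) // ct + 1
    return (channels, t, height // cs, width // cs)

# e.g. a 129-frame 720x1280 clip:
print(latent_shape(129, 720, 1280))  # (16, 33, 90, 160)
```

Under these assumptions, a 5-second 720p clip is reduced to a latent volume small enough for the diffusion transformer to process.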