
Read in Chinese

<p align="center"> <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/logo.png" height=100> </p>

HunyuanVideo: A Systematic Framework For Large Video Generation Model

<div align="center"> <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo Code&message=Github&color=blue"></a> &ensp; <a href="https://aivideo.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Web&color=green"></a> &ensp; <a href="https://video.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Playground&message=Web&color=green"></a> </div> <div align="center"> <a href="https://arxiv.org/abs/2412.03603"><img src="https://img.shields.io/static/v1?label=Tech Report&message=Arxiv&color=red"></a> &ensp; <a href="https://aivideo.hunyuan.tencent.com/hunyuanvideo.pdf"><img src="https://img.shields.io/static/v1?label=Tech Report&message=High-Quality Version (~350M)&color=red"></a> </div> <div align="center"> <a href="https://huggingface.co/tencent/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo&message=HuggingFace&color=yellow"></a> &ensp; <a href="https://huggingface.co/docs/diffusers/main/api/pipelines/hunyuan_video"><img src="https://img.shields.io/static/v1?label=HunyuanVideo&message=Diffusers&color=yellow"></a> &ensp; <a href="https://huggingface.co/tencent/HunyuanVideo-PromptRewrite"><img src="https://img.shields.io/static/v1?label=HunyuanVideo-PromptRewrite&message=HuggingFace&color=yellow"></a>

Replicate

</div> <p align="center"> 👋 Join our <a href="assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/tv7FkG4Nwf" target="_blank">Discord</a> </p>

This repo contains PyTorch model definitions, pre-trained weights, and inference/sampling code for our paper exploring HunyuanVideo. You can find more visualizations on our project page.
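Since HunyuanVideo is integrated into Diffusers, a short generation script is a convenient way to try the model. The sketch below is a minimal example assuming a recent `diffusers` release that ships `HunyuanVideoPipeline`, a CUDA GPU with enough memory, and the community Diffusers-format checkpoint repo id `hunyuanvideo-community/HunyuanVideo`; the resolution, frame count, and step count are illustrative, not recommended settings.

```python
# Hedged sketch: assumes diffusers with HunyuanVideoPipeline support,
# a CUDA device, and the Diffusers-format checkpoint repo id below.
import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

def generate(prompt: str, out_path: str = "output.mp4") -> str:
    model_id = "hunyuanvideo-community/HunyuanVideo"  # assumed repo id
    # Load the 13B transformer in bf16 to keep the weight footprint manageable.
    transformer = HunyuanVideoTransformer3DModel.from_pretrained(
        model_id, subfolder="transformer", torch_dtype=torch.bfloat16
    )
    pipe = HunyuanVideoPipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.float16
    )
    pipe.vae.enable_tiling()  # tile the VAE decode to reduce peak memory
    pipe.to("cuda")
    frames = pipe(
        prompt=prompt,
        height=320, width=512,   # small resolution for a quick smoke test
        num_frames=61,
        num_inference_steps=30,
    ).frames[0]
    export_to_video(frames, out_path, fps=15)
    return out_path

if __name__ == "__main__":
    generate("A cat walks on the grass, realistic style.")
```

The repository's own `sample_video.py` CLI exposes similar knobs; the Diffusers route is shown here only because it needs no repo checkout.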

HunyuanVideo: A Systematic Framework For Large Video Generation Model

🔥🔥🔥 News!!

  • Nov 21, 2025: 🎉 We release HunyuanVideo-1.5, a highly efficient and powerful new foundation model.
  • May 28, 2025: 💃 We release HunyuanVideo-Avatar, an audio-driven human animation model based on HunyuanVideo.
  • May 09, 2025: 🙆 We release HunyuanCustom, a multimodal-driven architecture for customized video generation based on HunyuanVideo.
  • Mar 06, 2025: 🌅 We release HunyuanVideo-I2V, an image-to-video model based on HunyuanVideo.
  • Jan 13, 2025: 📈 We release the Penguin Video Benchmark.
  • Dec 18, 2024: 🏃‍♂️ We release the FP8 model weights of HunyuanVideo to save GPU memory.
  • Dec 17, 2024: 🤗 HunyuanVideo has been integrated into Diffusers.
  • Dec 07, 2024: 🚀 We release the parallel inference code for HunyuanVideo powered by xDiT.
  • Dec 03, 2024: 👋 We release the inference code and model weights of HunyuanVideo. Download.
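The FP8 release above matters mainly for memory: at the roughly 13 billion parameters stated in the abstract, halving the bytes per weight halves the weight footprint. A back-of-envelope sketch (it ignores activations, the text encoders, and the VAE, and rounds the parameter count):

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

PARAMS = 13e9  # ~13B parameters, per the abstract

print(weight_memory_gb(PARAMS, 2))  # bf16/fp16 (2 bytes/param) -> 26.0 GB
print(weight_memory_gb(PARAMS, 1))  # fp8       (1 byte/param)  -> 13.0 GB
```

Actual peak usage during sampling is higher, but this is why FP8 weights meaningfully widen the set of GPUs that can load the model at all.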

🎥 Demo

<div align="center"> <video src="https://github.com/user-attachments/assets/22440764-0d7e-438e-a44d-d0dad1006d3d" width="70%" poster="./assets/video_poster.png"> </video> </div>

🧩 Community Contributions

If you develop or use HunyuanVideo in your projects, we welcome you to let us know.

📑 Open-source Plan

  • HunyuanVideo (Text-to-Video Model)
    • [x] Inference
    • [x] Checkpoints
    • [x] Multi-GPU sequence parallel inference (faster inference on more GPUs)
    • [x] Web Demo (Gradio)
    • [x] Diffusers
    • [x] FP8 quantized weights
    • [x] Penguin Video Benchmark
    • [x] ComfyUI
  • HunyuanVideo (Image-to-Video Model)
    • [x] Inference
    • [x] Checkpoints

Abstract

We present HunyuanVideo, a novel open-source video foundation model whose video generation performance is comparable, if not superior, to leading closed-source models. To train the HunyuanVideo model, we adopt several key technologies, including data curation, image-video joint model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling the model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models.

We conducted extensive experiments and implemented a series of targeted designs to ensure high visual quality, motion diversity, text-video alignment, and generation stability. According to professional human evaluation results, HunyuanVideo outperforms previous state-of-the-art models, including Runway Gen-3, Luma 1.6, and three top-performing Chinese video generative models. By releasing the code and weights of the foundation model and its applications, we aim to bridge the gap between closed-source and open-source video foundation models. This initiative will empower everyone in the community to experiment with their ideas, fostering a more dynamic and vibrant video generation ecosystem.

HunyuanVideo Overall Architecture

HunyuanVideo is trained on a spatio-temporally compressed latent space, which is compressed through a Causal 3D VAE.
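To make the compression concrete, the sketch below computes the latent shape produced by a causal 3D VAE. The 4× temporal ratio, 8× spatial ratio, 16 latent channels, and the 129-frame example come from the HunyuanVideo tech report, not from this page, so treat them as assumptions; the "+1" reflects the causal scheme, where the first frame is encoded on its own.

```python
def latent_shape(num_frames, height, width,
                 t_ratio=4, s_ratio=8, latent_channels=16):
    # Causal compression: the first frame maps to one latent frame,
    # then every t_ratio subsequent frames map to one more.
    latent_frames = (num_frames - 1) // t_ratio + 1
    return (latent_channels, latent_frames,
            height // s_ratio, width // s_ratio)

# A 129-frame 720x1280 clip compresses to a (16, 33, 90, 160) latent,
# so the diffusion transformer never sees raw pixels.
print(latent_shape(129, 720, 1280))
```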
