
Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Install / Use

/learn @Wan-Video/Wan2.1
README

Wan2.1

<p align="center"> <img src="assets/logo.png" width="400"/> </p>
<p align="center"> 💜 <a href="https://wan.video"><b>Wan</b></a> &nbsp;&nbsp;|&nbsp;&nbsp; 🖥️ <a href="https://github.com/Wan-Video/Wan2.1">GitHub</a> &nbsp;&nbsp;|&nbsp;&nbsp; 🤗 <a href="https://huggingface.co/Wan-AI/">Hugging Face</a> &nbsp;&nbsp;|&nbsp;&nbsp; 🤖 <a href="https://modelscope.cn/organization/Wan-AI">ModelScope</a> &nbsp;&nbsp;|&nbsp;&nbsp; 📑 <a href="https://arxiv.org/abs/2503.20314">Technical Report</a> &nbsp;&nbsp;|&nbsp;&nbsp; 📑 <a href="https://wan.video/welcome?spm=a2ty_o02.30011076.0.0.6c9ee41eCcluqg">Blog</a> &nbsp;&nbsp;|&nbsp;&nbsp; 💬 <a href="https://gw.alicdn.com/imgextra/i2/O1CN01tqjWFi1ByuyehkTSB_!!6000000000015-0-tps-611-1279.jpg">WeChat Group</a> &nbsp;&nbsp;|&nbsp;&nbsp; 📖 <a href="https://discord.gg/AKNgpMK4Yj">Discord</a> <br>
Wan: Open and Advanced Large-Scale Video Generative Models
</p>

In this repository, we present Wan2.1, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. Wan2.1 offers these key features:

  • 👍 SOTA Performance: Wan2.1 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
  • 👍 Supports Consumer-grade GPUs: The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models.
  • 👍 Multiple Tasks: Wan2.1 excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
  • 👍 Visual Text Generation: Wan2.1 is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
  • 👍 Powerful Video VAE: Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
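The "5-second 480P video" figure above follows from the clip-length arithmetic below. A minimal sketch, assuming the commonly cited Wan2.1 defaults of 81 output frames at 16 fps (both are assumptions, not stated in this README; adjust to your settings):

```python
NUM_FRAMES = 81  # assumption: common Wan2.1 default frame count
FPS = 16         # assumption: Wan2.1 output frame rate

# Clip duration in seconds for the default settings.
duration_s = NUM_FRAMES / FPS
print(round(duration_s, 2))  # -> 5.06, i.e. roughly the "5-second" clip above
```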

Video Demos

<div align="center"> <video src="https://github.com/user-attachments/assets/4aca6063-60bf-4953-bfb7-e265053f49ef" width="70%" poster=""> </video> </div>

🔥 Latest News!!

  • May 14, 2025: 👋 We introduce Wan2.1 VACE, an all-in-one model for video creation and editing, along with its inference code, weights, and technical report!
  • Apr 17, 2025: 👋 We introduce Wan2.1 FLF2V with its inference code and weights!
  • Mar 21, 2025: 👋 We are excited to announce the release of the Wan2.1 technical report. We welcome discussions and feedback!
  • Mar 3, 2025: 👋 Wan2.1's T2V and I2V have been integrated into Diffusers (T2V | I2V). Feel free to give it a try!
  • Feb 27, 2025: 👋 Wan2.1 has been integrated into ComfyUI. Enjoy!
  • Feb 25, 2025: 👋 We've released the inference code and weights of Wan2.1.
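The Diffusers integration mentioned above can be sketched as follows. This is a hedged sketch, not the repository's own inference script: `WanPipeline` and `export_to_video` are real Diffusers APIs, but the model id `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` and the 832x480 / 81-frame settings are assumptions to verify against the Diffusers docs before downloading anything:

```python
def build_t2v_pipeline(model_id: str = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"):
    """Load a Wan2.1 text-to-video pipeline via Diffusers (large download).

    The model id is an assumption based on the Diffusers-format checkpoints
    published on Hugging Face; verify it before use.
    """
    import torch
    from diffusers import WanPipeline

    pipe = WanPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    return pipe


def generate_clip(pipe, prompt: str, out_path: str = "wan_t2v.mp4"):
    """Generate a short 480P clip and write it to disk."""
    from diffusers.utils import export_to_video

    frames = pipe(
        prompt=prompt,
        height=480,
        width=832,       # assumed 480P aspect for Wan2.1 T2V
        num_frames=81,   # roughly 5 s at 16 fps
        guidance_scale=5.0,
    ).frames[0]
    export_to_video(frames, out_path, fps=16)
```

Usage would be `generate_clip(build_t2v_pipeline(), "a cat surfing a wave")` on a CUDA machine with enough VRAM for the 1.3B model.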

Community Works

If your work has improved Wan2.1 and you would like more people to see it, please inform us.

  • Helios, a breakthrough video generation model based on Wan2.1 that achieves minute-scale, high-quality video synthesis at 19.5 FPS on a single H100 GPU (about 10 FPS on a single Ascend NPU), without relying on conventional long-video anti-drifting strategies or standard video acceleration techniques. Visit their webpage for more details.
  • Video-As-Prompt, the first unified semantic-controlled video generation model based on Wan2.1-14B-I2V with a Mixture-of-Transformers architecture and in-context controls (e.g., concept, style, motion, camera). Refer to the project page for more examples.
  • LightX2V, a lightweight and efficient video generation framework that integrates Wan2.1 and Wan2.2, supports multiple engineering acceleration techniques for fast inference, which can run on RTX 5090 and RTX 4060 (8GB VRAM).
  • DriVerse, an autonomous driving world model based on Wan2.1-14B-I2V, generates future driving videos conditioned on any scene frame and given trajectory. Refer to the project page for more examples.
  • Training-Free-WAN-Editing, built on Wan2.1-T2V-1.3B, allows training-free video editing with image-based training-free methods, such as FlowEdit and FlowAlign.
  • Wan-Move, accepted to NeurIPS 2025, a framework that brings Wan2.1-I2V-14B to SOTA fine-grained, point-level motion control! Refer to their project page for more information.
  • EchoShot, a native multi-shot portrait video generation model based on Wan2.1-T2V-1.3B, allows generation of multiple video clips featuring the same character as well as highly flexible content controllability. Refer to their project page for more information.
  • AniCrafter, a human-centric animation model based on Wan2.1-14B-I2V, controls the Video Diffusion Models with 3DGS Avatars to insert and animate anyone into any scene following given motion sequences. Refer to the project page for more examples.
  • HyperMotion, a human image animation framework based on Wan2.1, addresses the challenge of generating complex human body motions in pose-guided animation. Refer to their website for more examples.
  • MagicTryOn, a video virtual try-on framework built upon Wan2.1-14B-I2V, addresses the limitations of existing models in expressing garment details and maintaining dynamic stability during human motion. Refer to their website for more examples.
  • ATI, built on Wan2.1-I2V-14B, is a trajectory-based motion-control framework that unifies object, local, and camera movements in video generation. Refer to their website for more examples.
  • Phantom has developed a unified video generation framework for single and multi-subject references based on both Wan2.1-T2V-1.3B and Wan2.1-T2V-14B. Please refer to their examples.
  • UniAnimate-DiT, based on Wan2.1-14B-I2V, has trained a Human image animation model and has open-sourced the inference and training code. Feel free to enjoy it!
  • CFG-Zero enhances Wan2.1 (covering both T2V and I2V models) from the perspective of CFG.
  • TeaCache now supports Wan2.1 acceleration, capable of increasing speed by approximately 2x. Feel free to give it a try!
  • DiffSynth-Studio provides more support for Wan2.1, including video-to-video, FP8 quantization, VRAM optimization, LoRA training, and more. Please refer to their examples.

📑 Todo List

  • Wan2.1 Text-to-Video
    • [x] Multi-GPU Inference code of the 14B and 1.3B models
    • [x] Checkpoints of the 14B and 1.3B models
    • [x] Gradio demo
    • [x] ComfyUI integration
    • [x] Diffusers integration
    • [ ] Diffusers + Multi-GPU Inference
  • Wan2.1 Image-to-Video
    • [x] Multi-GPU Inference code of the 14B model
    • [x] Checkpoints of the 14B model
    • [x] Gradio demo
    • [x] ComfyUI integration
    • [x] Diffusers integration
    • [ ] Diffusers + Multi-GPU Inference
  • Wan2.1 First-Last-Frame-to-Video
    • [x] Multi-GPU Inference code of the 14B model
    • [x] Checkpoints of the 14B model
    • [x] Gradio demo
    • [ ] ComfyUI integration
    • [ ] Diffusers integration
    • [ ] Diffusers + Multi-GPU Inference
  • Wan2.1 VACE
    • [x] Multi-GPU Inference code of the 14B and 1.3B models
    • [x] Checkpoints of the 14B and 1.3B models
    • [x] Gradio demo
    • [x] ComfyUI integration
    • [ ] Diffusers integration
    • [ ] Diffusers + Multi-GPU Inference

Quickstart

Installation

Clone the repo:

git clone https://github.com/Wan-Video/Wan2.1.git
cd Wan2.1

Install dependencies:

# Ensure torch >= 2.4.0
pip install -r requirements.txt
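The comment above asks for torch >= 2.4.0. A small pre-flight check, a sketch that compares plain dotted version strings numerically (it ignores local build tags like `+cu121` and does not handle pre-release suffixes):

```python
# Verify the installed torch meets the ">= 2.4.0" note before installing
# the rest of the requirements.
from importlib.metadata import version, PackageNotFoundError


def meets_minimum(installed: str, minimum: str = "2.4.0") -> bool:
    """Numeric compare of dotted versions; strips local tags like +cu121."""
    def parse(v: str) -> list[int]:
        return [int(p) for p in v.split("+")[0].split(".")[:3]]
    return parse(installed) >= parse(minimum)


try:
    print("torch OK" if meets_minimum(version("torch")) else "torch too old")
except PackageNotFoundError:
    print("torch is not installed")
```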

Model Download

| Models  | Download Link | Notes |
|---------|---------------|-------|
| T2V-14B
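Checkpoints can be fetched from the Hugging Face organization linked above. A minimal sketch using `huggingface_hub` (the repo id `Wan-AI/Wan2.1-T2V-14B` follows the naming of the Wan-AI organization, but verify it on Hugging Face before starting what is a very large download):

```python
def download_checkpoint(repo_id: str = "Wan-AI/Wan2.1-T2V-14B",
                        local_dir: str = "./Wan2.1-T2V-14B") -> str:
    """Download a Wan2.1 checkpoint snapshot; returns the local path."""
    # pip install huggingface_hub
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```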
