Sana

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer


<p align="center" style="border-radius: 10px"> <img src="asset/logo.png" width="35%" alt="logo"/> </p> <h3 align="center"> <a href="https://nvlabs.github.io/Sana/docs/"><b>📚 Docs</b></a> | <a href="https://nvlabs.github.io/Sana/"><b>SANA</b></a> | <a href="https://nvlabs.github.io/Sana/Sana-1.5/"><b>SANA-1.5</b></a> | <a href="https://nvlabs.github.io/Sana/Sprint/"><b>SANA-Sprint</b></a> | <a href="https://nvlabs.github.io/Sana/Video/"><b>SANA-Video</b></a> | <a href="https://nv-sana.mit.edu/"><b>Demo</b></a> | <a href="https://huggingface.co/collections/Efficient-Large-Model/sana"><b>🤗 HuggingFace</b></a>

<a href="https://github.com/lawrence-cj/ComfyUI_ExtraModels"><b>ComfyUI</b></a> | <a href="https://github.com/sgl-project/sglang"><b>SGLang</b></a> | <a href="https://github.com/nvidia-cosmos/cosmos-rl/blob/main/examples/sana.md"><b>Cosmos-RL</b></a>

</h3> <p align="center"> <a href="https://nv-sana.mit.edu/"><img src="https://img.shields.io/static/v1?label=Demo:6x3090&message=SANA&color=yellow"></a> &ensp; <a href="https://nv-sana.mit.edu/4bit/"><img src="https://img.shields.io/static/v1?label=Demo:1x3090&message=4bit&color=yellow"></a> &ensp; <a href="https://nv-sana.mit.edu/ctrlnet/"><img src="https://img.shields.io/static/v1?label=Demo:1x3090&message=ControlNet&color=yellow"></a> &ensp; <a href="https://nv-sana.mit.edu/sprint/"><img src="https://img.shields.io/static/v1?label=Demo:1x3090&message=SANA-Sprint&color=yellow"></a> &ensp; <a href="https://huggingface.co/spaces/Efficient-Large-Model/SanaSprint"><img src="https://img.shields.io/static/v1?label=Huggingface%20Demo&message=SANA-Sprint&color=yellow"></a> &ensp; </p> <p align="center"> <a href="https://replicate.com/chenxwh/sana"><img src="https://img.shields.io/static/v1?label=API:H100&message=Replicate&color=pink"></a> &ensp; <a href="https://discord.gg/rde6eaE5Ta"><img src="https://img.shields.io/static/v1?label=Discuss&message=Discord&color=purple&logo=discord"></a> &ensp; </p> <h4 align="center">ICLR 2025 Oral | ICML 2025 | ICCV 2025 Highlight | ICLR 2026 Oral </h4>

SANA is an efficiency-oriented codebase for high-resolution image and video generation, providing complete training and inference pipelines. This repository contains code for SANA, SANA-1.5, SANA-Sprint, and SANA-Video. More details can be found in our 📚 documentation.

Join our Discord to engage in discussions with the community! If you have any questions, run into issues, or are interested in contributing, don't hesitate to reach out!

<p align="center" border-radius="10px"> <img src="asset/Sana.jpg" width="90%" alt="teaser_page1"/> </p>

News

  • 🔥 [2026/03] 📺 The SANA-Video 720p model with LTX-VAE is released. Use it with the LTX2 Refiner to upscale videos to 2K resolution! See the Model Zoo, the SANA-Video doc, and the Blog about the refiner.
  • 🔥 [2026/03] 💪 Post Training Infra (SANA × Cosmos-RL): We partner with Cosmos-RL to provide a complete RL infrastructure for SANA. You can now post-train (SFT/RL) SANA-Image and SANA-Video with state-of-the-art algorithms (e.g. Diffusion-NFT, Flow-GRPO), preset configs, reward services, and flexible datasets. See SANA on Cosmos-RL and our Cosmos-RL integration doc.
  • 🔥 [2026/02] 🚀 SANA is now supported in SGLang! High-performance serving with an OpenAI-compatible API. [Guidance]
  • 🔥 [2026/01/26] SANA-Video is accepted as Oral by ICLR-2026. 🎉🎉🎉
  • 🔥 [2025/12/09] 🎬 LongSANA: a 27 FPS real-time, minute-length video generation model. Training and inference code are both released. Thanks to the LongLive Team. Refer to: [Train] | [Test] | [Weight]
  • 🔥 [2025/11/24] 🪶 Blog: how Causal Linear Attention unlocks infinite context for LLMs and long video generation.
  • 🔥 [2025/11/9] 🎬 An introduction video shows how Block Causal Linear Attention and Causal Mix-FFN work.
  • 🔥 [2025/11/6] 📺SANA-Video is merged into diffusers. How to use.
  • 🔥 [2025/10/27] 📺 SANA-Video is released, with support for Text-to-Video and TextImage-to-Video. [README] | [Weights]
  • 🔥 [2025/10/13] 📺 SANA-Video is coming: 1) a 5s Linear DiT video model, and 2) real-time minute-length video generation (with LongLive). [paper] | [Page]
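
The news items above mention Causal Linear Attention as the mechanism behind long-context video generation. As a rough illustration of why it scales, here is a minimal pure-Python sketch of generic causal linear attention in its recurrent form. It is not taken from the SANA codebase: the ReLU feature map, the `EPS` stabilizer, and every function name are assumptions made for this example, and it shows plain (per-token) causality rather than the block-causal variant.

```python
# Illustrative sketch only; names and the ReLU feature map are assumptions,
# not the SANA implementation.

EPS = 1e-6  # assumed small constant that keeps the normalizer non-zero


def relu(vec):
    """Feature map phi(x) = max(0, x), applied element-wise."""
    return [max(0.0, x) for x in vec]


def causal_linear_attention(queries, keys, values):
    """Recurrent form: the state has a fixed size, whatever the sequence length."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    norm = [0.0] * d_k                         # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = relu(q), relu(k)
        # Fold the current token into the state before reading from it,
        # so each position attends to itself and everything before it.
        for i in range(d_k):
            norm[i] += phi_k[i]
            for j in range(d_v):
                state[i][j] += phi_k[i] * v[j]
        denom = sum(phi_q[i] * norm[i] for i in range(d_k)) + EPS
        outputs.append(
            [sum(phi_q[i] * state[i][j] for i in range(d_k)) / denom
             for j in range(d_v)]
        )
    return outputs


def causal_quadratic_reference(queries, keys, values):
    """Same quantity computed with explicit O(n^2) weights, for checking only."""
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        phi_q = relu(q)
        weights = [sum(a * b for a, b in zip(phi_q, relu(keys[s])))
                   for s in range(t + 1)]
        denom = sum(weights) + EPS
        outputs.append(
            [sum(w * values[s][j] for s, w in enumerate(weights)) / denom
             for j in range(d_v)]
        )
    return outputs


# Toy inputs: 3 tokens, 2-dimensional queries/keys/values.
q = [[1.0, 0.5], [0.2, -1.0], [0.7, 0.3]]
k = [[0.3, 1.0], [1.0, 0.1], [-0.5, 0.8]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
fast = causal_linear_attention(q, k, v)
slow = causal_quadratic_reference(q, k, v)
```

Both functions compute the same outputs up to floating-point rounding. The recurrent version only carries `state` and `norm` forward, so its memory use does not grow with the number of tokens, which is the property that makes very long sequences tractable.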
<details> <summary>Click to show all updates</summary>
  • ✅ [2025/8/20] We release a new DC-AE-Lite for faster inference and lower memory use. [How to config] | [diffusers PR] | [Weight]
  • ✅ [2025/6/25] SANA-Sprint was accepted to ICCV'25 🏖️
  • ✅ [2025/6/4] SANA-Sprint ComfyUI Node is released [Example].
  • ✅ [2025/5/8] SANA-Sprint (One-step diffusion) diffusers training code is released [Guidance].
  • ✅ [2025/5/4] SANA-1.5 (Inference-time scaling) is accepted by ICML-2025. 🎉🎉🎉
  • ✅ [2025/3/22] 🔥SANA-Sprint demo is hosted on Huggingface, try it! 🎉 [Demo Link]
  • ✅ [2025/3/22] 🔥SANA-1.5 is supported in ComfyUI! 🎉: ComfyUI Guidance | ComfyUI Work Flow SANA-1.5 4.8B
  • ✅ [2025/3/22] 🔥SANA-Sprint code & weights are released! 🎉 Training & Inference code and Weights / HF are all available. [Guidance]
  • ✅ [2025/3/21] 🚀Sana + Inference Scaling is released. [Guidance]
  • ✅ [2025/3/16] 🔥SANA-1.5 code & weights are released! 🎉 This includes DDP/FSDP | TAR file WebDataset | Multi-Scale Training code and Weights | HF.
  • ✅ [2025/3/14] 🏃SANA-Sprint is coming out! 🎉 A new one/few-step generator for Sana: 0.1s per 1024px image on H100, 0.3s on RTX 4090. Find out more: [Page] | [Arxiv]. Code is coming very soon, along with diffusers support.
  • ✅ [2025/2/10] 🚀Sana + ControlNet is released. [Guidance] | [Model] | [Demo]
  • ✅ [2025/1/30] We release the CAME-8bit optimizer code, which saves more GPU memory during training. [How to config]
  • ✅ [2025/1/29] 🎉 🎉 🎉SANA 1.5 is out! Learn how to do efficient training & inference scaling! 🚀[Tech Report]
  • ✅ [2025/1/24] 4bit-Sana is released, powered by SVDQuant and the Nunchaku inference engine. You can now run Sana within 8GB of GPU VRAM. [Guidance] [Demo] [Model]
  • ✅ [2025/1/24] DCAE-1.1 is released, with better reconstruction quality. [Model] [diffusers]
  • ✅ [2025/1/23] Sana is accepted as Oral by ICLR-2025. 🎉🎉🎉
  • ✅ [2025/1/12] DC-AE tiling lets Sana-4K generate 4096x4096px images within 22GB of GPU memory. With model offload and 8bit/4bit quantization, 4K Sana runs within 8GB of GPU VRAM. [Guidance]
  • ✅ [2025/1/11] Sana code-base license changed to Apache 2.0.
  • ✅ [2025/1/10] Run Sana inference with 8bit quantization. [Guidance]
  • ✅ [2025/1/8] 4K r
