
HiDiffusion

[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!


<!-- # 💡 HiDiffusion --> <div align="center"> <img src="assets/hidiffusion_logo.jpg" height=120> </div>

<div align="center">💡 HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models</div>

<div align="center"><a href="https://scholar.google.com/citations?hl=zh-CN&user=QFowS4cAAAAJ">Shen Zhang</a>, Zhaowei Chen, Zhenyu Zhao, Yuhao Chen, Yao Tang, <a href="https://jiajunvision.github.io/">Jiajun Liang</a></div> <br> <div align="center"> <a href="https://hidiffusion.github.io/"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages"></a> &ensp; <a href="https://link.springer.com/chapter/10.1007/978-3-031-72983-6_9"><img src="https://img.shields.io/static/v1?label=Paper&message=ECCV&color=yellow"></a> &ensp; <a href="https://arxiv.org/abs/2311.17528"><img src="https://img.shields.io/static/v1?label=Paper&message=Arxiv&color=red&logo=arxiv"></a> &ensp; <a href="https://colab.research.google.com/drive/1EiBn9lSnPZTU4cikRRaBBexs429M-qty?usp=sharing"><img src="https://img.shields.io/static/v1?label=Demo&message=Colab&color=purple&logo=googlecolab"></a> &ensp; <a href="https://openbayes.com/console/public/tutorials/SaPYcYCaWSA"><img src="https://img.shields.io/static/v1?label=Demo&message=OpenBayes&color=green"></a> &ensp; </div> <div align="center"> <img src="assets/image_gallery.jpg" width="800" ></img> <br> <em> (Select HiDiffusion samples for various diffusion models, resolutions, and aspect ratios.) </em> </div> <br>

👉 Why HiDiffusion

  • A training-free method that increases the resolution and speed of pretrained diffusion models.
  • Designed as a plug-and-play implementation. It can be integrated into diffusion pipelines by only adding a single line of code!
  • Supports various tasks, including text-to-image, image-to-image, and inpainting.
<div align="center"> <img src="assets/quality_efficiency.jpg" width="800" ></img> <br> <em> (Faster, and better image details.) </em> </div> <br> <div align="center"> <img src="assets/various_task.jpg" width="800" ></img> <br> <em> (2K results of ControlNet and inpainting tasks.) </em> </div> <br>

🔥 Update

  • 2024.8.15 - 💥 Diffusers documentation has added HiDiffusion, see here. Thanks to the Diffusers team!

  • 2024.7.3 - 💥 Accepted by ECCV 2024!

  • 2024.6.19 - 💥 Integrated into OpenBayes, see the demo. Thanks to the OpenBayes team!

  • 2024.6.16 - 💥 Added support for PyTorch 2.x.

  • 2024.6.16 - 💥 Fixed the non-square generation issue. HiDiffusion now supports more image sizes and aspect ratios.

  • 2024.5.7 - 💥 Added support for the image-to-image task, see here.

  • 2024.4.16 - 💥 Released the source code.

📢 Supported Models

HiDiffusion works with Stable Diffusion XL, Stable Diffusion XL Turbo, Stable Diffusion v2-1, and Stable Diffusion v1-5 (see the usage examples below).

Note: HiDiffusion also supports downstream diffusion models based on these repositories, such as Ghibli-Diffusion, Playground, etc.

💣 Supported Tasks

  • ✅ Text-to-image
  • ✅ ControlNet, including text-to-image and image-to-image
  • ✅ Inpainting

🔎 Main Requirements

This repository is tested with:

  • Python==3.8
  • torch>=1.13.1
  • diffusers>=0.25.0
  • transformers
  • accelerate
  • xformers

🔑 Install HiDiffusion

After installing the packages in the main requirements, install HiDiffusion:

pip3 install hidiffusion

Installing from source

Alternatively, you can install from the GitHub source. Clone the repository and install:

git clone https://github.com/megvii-model/HiDiffusion.git
cd HiDiffusion
python3 setup.py install

🚀 Usage

Generating outputs with HiDiffusion on top of 🤗 Diffusers is straightforward: you just need to add a single line of code.

Text-to-image generation

Stable Diffusion XL

from hidiffusion import apply_hidiffusion, remove_hidiffusion
from diffusers import StableDiffusionXLPipeline, DDIMScheduler
import torch
pretrain_model = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(pretrain_model, subfolder="scheduler")
pipe = StableDiffusionXLPipeline.from_pretrained(pretrain_model, scheduler=scheduler, torch_dtype=torch.float16, variant="fp16").to("cuda")

# Optional: enable_xformers_memory_efficient_attention saves memory and speeds up
# inference; enable_model_cpu_offload and enable_vae_tiling reduce memory usage.
# pipe.enable_xformers_memory_efficient_attention()
# pipe.enable_model_cpu_offload()
# pipe.enable_vae_tiling()

# Apply hidiffusion with a single line of code.
apply_hidiffusion(pipe)

prompt = "Standing tall amidst the ruins, a stone golem awakens, vines and flowers sprouting from the crevices in its body."
negative_prompt = "blurry, ugly, duplicate, poorly drawn face, deformed, mosaic, artifacts, bad limbs"
image = pipe(prompt, guidance_scale=7.5, height=2048, width=2048, eta=1.0, negative_prompt=negative_prompt).images[0]
image.save("golem.jpg")
<details> <summary>Output:</summary> <div align="center"> <img src="assets/sdxl.jpg" width="800" ></img> </div> </details>

Set height=4096 and width=4096 to get a 4096x4096 output.
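Whatever size you choose, Stable Diffusion pipelines generally expect height and width divisible by 8 (the VAE downsampling factor; this is a standard Diffusers constraint, not HiDiffusion-specific). A small helper to snap arbitrary sizes:

```python
def snap_size(height: int, width: int, multiple: int = 8) -> tuple[int, int]:
    """Round a requested size down to the nearest multiple accepted by SD pipelines."""
    return (height // multiple) * multiple, (width // multiple) * multiple

# e.g. pass snap_size(h, w) results as the height/width arguments to pipe(...)
```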

Stable Diffusion XL Turbo

from hidiffusion import apply_hidiffusion, remove_hidiffusion
from diffusers import AutoPipelineForText2Image
import torch
pretrain_model = "stabilityai/sdxl-turbo"
pipe = AutoPipelineForText2Image.from_pretrained(pretrain_model, torch_dtype=torch.float16, variant="fp16").to('cuda')

# Optional: enable_xformers_memory_efficient_attention saves memory and speeds up
# inference; enable_model_cpu_offload and enable_vae_tiling reduce memory usage.
# pipe.enable_xformers_memory_efficient_attention()
# pipe.enable_model_cpu_offload()
# pipe.enable_vae_tiling()

# Apply hidiffusion with a single line of code.
apply_hidiffusion(pipe)

prompt = "In the depths of a mystical forest, a robotic owl with night vision lenses for eyes watches over the nocturnal creatures."
image = pipe(prompt, num_inference_steps=4, height=1024, width=1024, guidance_scale=0.0).images[0]
image.save("./owl.jpg")
<details> <summary>Output:</summary> <div align="center"> <img src="assets/sdxl_turbo.jpg" width="800" ></img> </div> </details>

Stable Diffusion v2-1

from hidiffusion import apply_hidiffusion, remove_hidiffusion
from diffusers import DiffusionPipeline, DDIMScheduler
import torch
pretrain_model = "stabilityai/stable-diffusion-2-1-base"
scheduler = DDIMScheduler.from_pretrained(pretrain_model, subfolder="scheduler")
pipe = DiffusionPipeline.from_pretrained(pretrain_model, scheduler=scheduler, torch_dtype=torch.float16).to("cuda")

# Optional: enable_xformers_memory_efficient_attention saves memory and speeds up
# inference; enable_model_cpu_offload and enable_vae_tiling reduce memory usage.
# pipe.enable_xformers_memory_efficient_attention()
# pipe.enable_model_cpu_offload()
# pipe.enable_vae_tiling()

# Apply hidiffusion with a single line of code.
apply_hidiffusion(pipe)

prompt = "An adorable happy brown border collie sitting on a bed, high detail."
negative_prompt = "ugly, tiling, out of frame, poorly drawn face, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, artifacts, bad proportions."
image = pipe(prompt, guidance_scale=7.5, height=1024, width=1024, eta=1.0, negative_prompt=negative_prompt).images[0]
image.save("collie.jpg")
<details> <summary>Output:</summary> <div align="center"> <img src="assets/sd21.jpg" width="800" ></img> </div> </details>

Set height=2048 and width=2048 to get a 2048x2048 output.

Stable Diffusion v1-5

from hidiffusion import apply_hidiffusion, remove_hidiffusion
from diffusers import DiffusionPipeline, DDIMScheduler
import torch
pretrain_model = "runwayml/stable-diffusion-v1-5"
scheduler = DDIMScheduler.from_pretrained(pretrain_model, subfolder="scheduler")
pipe = DiffusionPipeline.from_pretrained(pretrain_model, scheduler=scheduler, torch_dtype=torch.float16).to("cuda")

# Optional: enable_xformers_memory_efficient_attention saves memory and speeds up
# inference; enable_model_cpu_offload and enable_vae_tiling reduce memory usage.
# pipe.enable_xformers_memory_efficient_attention()
# pipe.enable_model_cpu_offload()
# pipe.enable_vae_tiling()

# Apply hidiffusion with a single line of code.
apply_hidiffusion(pipe)

prompt = "thick strokes, bright colors, an exotic fox, cute, chibi kawaii. detailed fur, hyperdetailed , big reflective eyes, fairytale, artstation,centered composition, perfect composition, centered, vibrant colors, muted colors, high detailed, 8k."
negative_prompt = "ugly, tiling, poorly drawn face, out of frame, disfigured, deformed, blurry, bad anatomy, blurred."
image = pipe(prompt, guidance_scale=7.5, height=1024, width=1024, eta=1.0, negative_prompt=negative_prompt).images[0]
image.save("fox.jpg")
<details> <summary>Output:</summary> <div align="center"> <img src="assets/sd15.jpg" width="800" ></img> </div> </details>

Set height=2048 and width=2048 to get a 2048x2048 output.

Remove HiDiffusion

If you want to remove HiDiffusion, simply call remove_hidiffusion(pipe).
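Since apply_hidiffusion and remove_hidiffusion can be toggled on one pipeline, a common pattern is to enable HiDiffusion only for requests above the model's training resolution. A minimal sketch of the decision logic (the 1024-pixel base resolution is an assumption for SDXL; adjust per model):

```python
def use_hidiffusion(height: int, width: int, base_resolution: int = 1024) -> bool:
    """Enable HiDiffusion only when the request exceeds the training resolution."""
    return max(height, width) > base_resolution

# Usage with a diffusers pipeline (sketch):
#   from hidiffusion import apply_hidiffusion, remove_hidiffusion
#   if use_hidiffusion(h, w):
#       apply_hidiffusion(pipe)
#   else:
#       remove_hidiffusion(pipe)
```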

ControlNet

Text-to-image generation

The snippet below is a sketch of HiDiffusion with ControlNet text-to-image, following the standard 🤗 Diffusers ControlNet API; the Canny ControlNet checkpoint, conditioning image path, and prompt are illustrative.

import torch
import numpy as np
import cv2
from PIL import Image
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from hidiffusion import apply_hidiffusion

# Illustrative conditioning image; replace with your own.
image = np.array(load_image("path/to/condition.png"))
# Extract Canny edges as the ControlNet condition.
edges = cv2.Canny(image, 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet, torch_dtype=torch.float16, variant="fp16",
).to("cuda")

# Apply hidiffusion with a single line of code.
apply_hidiffusion(pipe)

prompt = "a futuristic cityscape at sunset, highly detailed"
image = pipe(prompt, image=canny_image, height=2048, width=2048).images[0]
image.save("controlnet.jpg")
