DiffBIR

[ECCV 2024] Code for DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior

DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior

Paper | Project Page

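Xinqi Lin<sup>1,*</sup>, Jingwen He<sup>2,3,*</sup>, Ziyan Chen<sup>1</sup>, Zhaoyang Lyu<sup>2</sup>, Bo Dai<sup>2</sup>, Fanghua Yu<sup>1</sup>, Wanli Ouyang<sup>2</sup>, Yu Qiao<sup>2</sup>, Chao Dong<sup>1,2</sup>

<sup>1</sup>Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences; <sup>2</sup>Shanghai AI Laboratory; <sup>3</sup>The Chinese University of Hong Kong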

:star: If DiffBIR is helpful to you, please star this repo. Thanks! :hugs:

<a name="update"></a>:new:Update

2025.07.29: ✅ We've released our new work HYPIR! It significantly outperforms DiffBIR 🔥🔥🔥 while also being tens of times faster 🔥🔥🔥. We welcome you to try it out.
2024.11.27: ✅ Release DiffBIR v2.1, including a new model trained on the Unsplash dataset with LLaVA-generated captions, more samplers, better tiled-sampling support, and more. See the release notes for details.
2024.04.08: ✅ Release everything about our updated manuscript, including (1) a new model trained on a subset of laion2b-en and (2) a more readable code base. DiffBIR is now a general restoration pipeline that can handle different blind image restoration tasks with a unified generation module.
2023.09.19: ✅ Add support for Apple Silicon! See installation_xOS.md to run on CPU/CUDA/MPS devices.
2023.09.14: ✅ Integrate a patch-based sampling strategy (mixture-of-diffusers). Try it! Here is an example with a resolution of 2396 x 1596. We will keep optimizing GPU memory usage and look forward to your pull requests!
2023.09.14: ✅ Add support for a background upsampler (DiffBIR/RealESRGAN) in face enhancement! :rocket: Try it!
2023.09.13: :rocket: Provide an online demo (DiffBIR-official) on OpenXLab that integrates both the general model and the face model. Give it a try! camenduru has also implemented an online demo; thanks for his work. :hugs:
2023.09.12: ✅ Upload the inference code for latent image guidance and release the real47 testset.
2023.09.08: ✅ Add support for restoring unaligned faces.
2023.09.06: :rocket: Update the Colab demo. Thanks to camenduru! :hugs:
2023.08.30: This repo is released.

<a name="visual_results"></a>:eyes:Visual Results On Real-world Images

Blind Image Super-Resolution

Blind Face Restoration

:star: Face and the background enhanced by DiffBIR.

Blind Image Denoising

8x Blind Super-Resolution With Tiled Sampling

I often think of Bag End. I miss my books and my arm chair, and my garden. See, that's where I belong. That's home. --- Bilbo Baggins

<a name="todo"></a>:climbing:TODO

[x] Release code and pretrained models :computer:.
[x] Update links to paper and project page :link:.
[x] Release real47 testset :minidisc:.
[ ] Provide webui.
[x] Reduce the VRAM usage of DiffBIR :fire::fire::fire:.
[ ] Provide HuggingFace demo :notebook:.
[x] Add a patch-based sampling schedule :mag:.
[x] Upload inference code of latent image guidance :page_facing_up:.
[x] Improve the performance :superhero:.
[x] Support MPS acceleration for macOS users.
[ ] DiffBIR-turbo :fire::fire::fire:.
[x] Speed up inference, e.g. with fp16/bf16 or torch.compile :fire::fire::fire:.

<a name="installation"></a>:gear:Installation

# clone this repo
git clone https://github.com/XPixelGroup/DiffBIR.git
cd DiffBIR

# create environment
conda create -n diffbir python=3.10
conda activate diffbir
pip install -r requirements.txt

The current code targets PyTorch 2.2.2, which has built-in support for memory-efficient attention. If your GPU is not compatible with recent PyTorch releases, downgrade to PyTorch 1.13.1+cu116 and install xformers 0.0.16 instead.
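
As a rough illustration of the version note above, the following Python sketch decides which attention path applies to a given PyTorch version string. The helper names `parse_version` and `attention_backend` are hypothetical and not part of DiffBIR. The 2.0 cutoff is an assumption based on the release that added `torch.nn.functional.scaled_dot_product_attention`; the fallback simply mirrors the xformers suggestion.

```python
import re


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '1.13.1+cu116'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def attention_backend(torch_version: str) -> str:
    """Return 'built-in' or 'xformers' for a PyTorch version (hypothetical helper)."""
    # Assumption: built-in scaled dot-product attention is available from 2.0 on.
    if parse_version(torch_version) >= (2, 0, 0):
        return "built-in"
    # Older releases need an external implementation such as xformers.
    return "xformers"
```

In an activated environment, passing `torch.__version__` to `attention_backend` would show which of the two setups described above applies.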

<a name="quick_start"></a>:flight_departure:Quick Start

Run the following command to launch the Gradio web interface.

# For low-VRAM users, set captioner to ram or none
python run_gradio.py --captioner llava
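
The captioner choice mentioned in the comment above could be wrapped in a small launcher, sketched here in Python. `gradio_command` is a hypothetical helper, not part of the repo, and `KNOWN_CAPTIONERS` contains only the three values named in this README (`llava`, `ram`, `none`); the script may accept others.

```python
import shlex

# Captioner values mentioned in this README; assumed, not an exhaustive list.
KNOWN_CAPTIONERS = ("llava", "ram", "none")


def gradio_command(captioner: str = "llava") -> list[str]:
    """Build the argument list for launching run_gradio.py (hypothetical helper)."""
    if captioner not in KNOWN_CAPTIONERS:
        raise ValueError(
            f"unknown captioner {captioner!r}; expected one of {KNOWN_CAPTIONERS}"
        )
    return ["python", "run_gradio.py", "--captioner", captioner]


# Print the low-VRAM variant of the command without running it.
print(shlex.join(gradio_command("ram")))
```

Passing the returned list to `subprocess.run(..., check=True)` from the repository root would start the interface with the chosen captioner.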

<a name="pretrained_models"></a>:dna:Pretrained Models

Here we list the pretrained weights of the stage 2 model (IRControlNet) and of our trained SwinIR, which was used for degradation removal while training the stage 2 model.
