DiffBIR
[ECCV 2024] codes of DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
Xinqi Lin<sup>1,*</sup>, Jingwen He<sup>2,3,*</sup>, Ziyan Chen<sup>1</sup>, Zhaoyang Lyu<sup>2</sup>, Bo Dai<sup>2</sup>, Fanghua Yu<sup>1</sup>, Wanli Ouyang<sup>2</sup>, Yu Qiao<sup>2</sup>, Chao Dong<sup>1,2</sup>
<sup>1</sup>Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences<br><sup>2</sup>Shanghai AI Laboratory<br><sup>3</sup>The Chinese University of Hong Kong
<p align="center"> <img src="assets/teaser.png"> </p><p align="center"> <img src="assets/pipeline.png"> </p>
:star: If DiffBIR is helpful to you, please star this repo. Thanks! :hugs:
:book:Table Of Contents
- Update
- Visual Results On Real-world Images
- TODO
- Installation
- Quick Start
- Pretrained Models
- Inference
- Train
<a name="update"></a>:new:Update
- 2025.07.29: ✅ We've released our new work HYPIR! It significantly outperforms DiffBIR🔥🔥🔥, while also being tens of times faster🔥🔥🔥. We welcome you to try it out.
- 2024.11.27: ✅ Release DiffBIR v2.1, including a new model trained on the Unsplash dataset with LLaVA-generated captions, more samplers, better tiled-sampling support, and more. Check the release note for details.
- 2024.04.08: ✅ Release everything about our updated manuscript, including (1) a new model trained on a subset of laion2b-en and (2) a more readable code base. DiffBIR is now a general restoration pipeline that can handle different blind image restoration tasks with a unified generation module.
- 2023.09.19: ✅ Add support for Apple Silicon! Check installation_xOS.md to work with CPU/CUDA/MPS devices!
- 2023.09.14: ✅ Integrate a patch-based sampling strategy (mixture-of-diffusers). Try it! Here is an example with a resolution of 2396 x 1596. We will keep optimizing GPU memory usage and look forward to your pull requests!
- 2023.09.14: ✅ Add support for a background upsampler (DiffBIR/RealESRGAN) in face enhancement! :rocket: Try it!
- 2023.09.13: :rocket: Provide an online demo (DiffBIR-official) on OpenXLab, which integrates both the general model and the face model. Please give it a try! camenduru also implemented an online demo; thanks for his work. :hugs:
- 2023.09.12: ✅ Upload inference code of latent image guidance and release real47 testset.
- 2023.09.08: ✅ Add support for restoring unaligned faces.
- 2023.09.06: :rocket: Update colab demo. Thanks to camenduru!:hugs:
- 2023.08.30: This repo is released.
<a name="visual_results"></a>:eyes:Visual Results On Real-world Images
Blind Image Super-Resolution
<img src="assets/visual_results/bsr6.png" height="223px"/> <img src="assets/visual_results/bsr7.png" height="223px"/> <img src="assets/visual_results/bsr4.png" height="223px"/>
<!-- [<img src="assets/visual_results/bsr1.png" height="223px"/>](https://imgsli.com/MTk5ODIy) [<img src="assets/visual_results/bsr2.png" height="223px"/>](https://imgsli.com/MTk5ODIz) [<img src="assets/visual_results/bsr3.png" height="223px"/>](https://imgsli.com/MTk5ODI0) [<img src="assets/visual_results/bsr5.png" height="223px"/>](https://imgsli.com/MjAxMjM0) --> <!-- [<img src="assets/visual_results/bsr1.png" height="223px"/>](https://imgsli.com/MTk5ODIy) [<img src="assets/visual_results/bsr5.png" height="223px"/>](https://imgsli.com/MjAxMjM0) -->
Blind Face Restoration
<!-- [<img src="assets/visual_results/bfr1.png" height="223px"/>](https://imgsli.com/MTk5ODI5) [<img src="assets/visual_results/bfr2.png" height="223px"/>](https://imgsli.com/MTk5ODMw) [<img src="assets/visual_results/bfr4.png" height="223px"/>](https://imgsli.com/MTk5ODM0) -->
<img src="assets/visual_results/whole_image1.png" height="370"/> <img src="assets/visual_results/whole_image2.png" height="370"/>
:star: Both the face and the background are enhanced by DiffBIR.
Blind Image Denoising
<img src="assets/visual_results/bid1.png" height="215px"/> <img src="assets/visual_results/bid3.png" height="215px"/> <img src="assets/visual_results/bid2.png" height="215px"/>
8x Blind Super-Resolution With Tiled Sampling
I often think of Bag End. I miss my books and my arm chair, and my garden. See, that's where I belong. That's home. --- Bilbo Baggins
<img src="assets/visual_results/tiled_sampling.png" height="480px"/>
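Tiled sampling processes a large image as a set of overlapping tiles rather than in one pass. The sketch below only illustrates how such a tile grid could be laid out; it is not DiffBIR's implementation. The function names and the tile size and stride in the usage note are hypothetical, and how overlapping regions are blended is not covered.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets of overlapping tiles that together cover [0, length)."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if length <= tile:
        # The whole axis fits in a single tile.
        return [0]
    starts = list(range(0, length - tile, stride))
    # Add a final tile flush with the far edge so the border is covered.
    starts.append(length - tile)
    return starts


def tile_grid(height: int, width: int, tile: int, stride: int) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes for every tile, row by row."""
    boxes = []
    for top in tile_starts(height, tile, stride):
        for left in tile_starts(width, tile, stride):
            # Clip to the image in case it is smaller than one tile.
            boxes.append((top, left, min(top + tile, height), min(left + tile, width)))
    return boxes
```

For example, `tile_grid(1596, 2396, tile=512, stride=256)` lays out tiles for an image of the size mentioned in the update notes, using assumed tile settings. A stride smaller than the tile size makes neighbouring tiles overlap, which gives a sampler room to blend across seams.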
<a name="todo"></a>:climbing:TODO
- [x] Release code and pretrained models :computer:.
- [x] Update links to paper and project page :link:.
- [x] Release real47 testset :minidisc:.
- [ ] Provide webui.
- [x] Reduce the vram usage of DiffBIR :fire::fire::fire:.
- [ ] Provide HuggingFace demo :notebook:.
- [x] Add a patch-based sampling schedule :mag:.
- [x] Upload inference code of latent image guidance :page_facing_up:.
- [x] Improve the performance :superhero:.
- [x] Support MPS acceleration for MacOS users.
- [ ] DiffBIR-turbo :fire::fire::fire:.
- [x] Speed up inference, such as using fp16/bf16, torch.compile :fire::fire::fire:.
<a name="installation"></a>:gear:Installation
```shell
# clone this repo
git clone https://github.com/XPixelGroup/DiffBIR.git
cd DiffBIR

# create environment
conda create -n diffbir python=3.10
conda activate diffbir
pip install -r requirements.txt
```
Our new code is based on PyTorch 2.2.2, which provides built-in support for memory-efficient attention. If your GPU is not compatible with the latest PyTorch, downgrade PyTorch to 1.13.1+cu116 and install xformers 0.0.16 as an alternative.
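As a rough illustration of the version note above, the sketch below maps a PyTorch version string to the attention path it implies. It relies on the fact that `torch.nn.functional.scaled_dot_product_attention` was introduced in PyTorch 2.0. The function name and the returned labels are hypothetical and are not part of DiffBIR.

```python
import re


def pick_attention_backend(torch_version: str) -> str:
    """Return 'sdpa' for PyTorch >= 2.0, otherwise 'xformers'.

    The labels are illustrative only: 'sdpa' stands for PyTorch's built-in
    scaled_dot_product_attention, 'xformers' for the external fallback.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return "sdpa" if (major, minor) >= (2, 0) else "xformers"
```

In an installed environment, the input would typically be `torch.__version__`; local build suffixes such as `+cu116` are ignored by the parser.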
<!-- Note the installation is only compatible with **Linux** users. If you are working on different platforms, please check [xOS Installation](assets/docs/installation_xOS.md). -->
<a name="quick_start"></a>:flight_departure:Quick Start
Run the following command to launch the Gradio web interface.
```shell
# For low-VRAM users, set captioner to ram or none
python run_gradio.py --captioner llava
```
<div align="center">
<kbd><img src="assets/gradio.png"></img></kbd>
</div>
<a name="pretrained_models"></a>:dna:Pretrained Models
Here we list the pretrained weights of the stage 2 model (IRControlNet) and of our trained SwinIR, which was used for degradation removal during the training of the stage 2 model.
| Model Name | Description | HuggingFace | BaiduNetdisk | OpenXLab |
| :---------: | :----------: | :----------: | :----------: | :----------: |
| v2.1.pt | IRControlNet trained on filtered unsplash | download | N/A | N/A |
| v2.pth | IRControlNet trained on filtered laion2b-en | download | download<br>(pwd: xiu3) | download |
| v1_general.pth | IRControlNet trained on ImageNet-1k | download | download<br>(pwd: 79n9) | download |
| v1_face.pth | IRControlNet trained on FFHQ | download | download<br>(pwd: n7dx) | download |
| codeformer_swinir.ckpt | SwinIR trained on ImageNet-1k with CodeFormer degradation | download | download<br>(pwd: vfif) | download |
| realesrgan_s4_swinir_100k.pth | SwinIR trained on ImageNet-1k with Real-ESRGAN degradation | download | N/A | N