FlowIE: Efficient Image Enhancement via Rectified Flow (CVPR 2024)

Yixuan Zhu*, Wenliang Zhao* $\dagger$, Ao Li, Yansong Tang, Jie Zhou, Jiwen Lu $\ddagger$

* Equal contribution $\dagger$ Project leader $\ddagger$ Corresponding author

[Paper]

The repository contains the official implementation for the paper "FlowIE: Efficient Image Enhancement via Rectified Flow" (CVPR 2024, oral presentation).

FlowIE is a simple yet highly effective <ins>Flow</ins>-based <ins>I</ins>mage <ins>E</ins>nhancement framework that estimates straight-line paths from an elementary distribution to high-quality images.
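For readers unfamiliar with rectified flow, the generic formulation behind the "straight-line path" idea can be written as follows. This is a sketch of the standard rectified-flow setup, not FlowIE's exact objective. The symbols $x_0$ (a sample from the elementary distribution), $x_1$ (a high-quality image), and $v_\theta$ (the learned velocity field, i.e. the path estimator) are notation assumed here; how FlowIE conditions on the degraded input is described in the paper.

$$
x_t = (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1]
$$

$$
\min_\theta \; \mathbb{E}_{x_0,\,x_1,\,t}\left[\,\bigl\| (x_1 - x_0) - v_\theta(x_t, t) \bigr\|^2\,\right]
$$

$$
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v_\theta(x_t, t)
$$

Because the target velocity $x_1 - x_0$ is constant along each path, a well-trained $v_\theta$ can be integrated with only a few Euler steps, which is where the efficiency comes from.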

📋 To-Do List

[x] Release model and inference code.
[x] Release code for training dataloader.

💡 Pipeline

😀Quick Start

⚙️ 1. Installation

We recommend using an Anaconda virtual environment. If Anaconda is installed, run the following commands to create and activate the environment.

conda env create -f requirements.txt
conda activate FlowIE

📑 2. Modify the lora configuration

Since we use MemoryEfficientCrossAttention to accelerate inference, lora.py in the lora_diffusion package needs a small modification, which takes about two minutes:

(1) Locate the lora.py file in the package directory. You can easily find this file by using the "go to definition" button in Line 4 of the ./model/cldm.py file.
(2) Make the following modifications to Lines 159-161 in lora.py:

Original Code:

UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU"}
UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU"}

Modified Code:

UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU", "MemoryEfficientCrossAttention"}
UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU", "MemoryEfficientCrossAttention", "ResBlock"}
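If no IDE is at hand, the file can also be located from Python. The following is a minimal sketch, assuming the package is importable as lora_diffusion (as the file name lora.py in the lora_diffusion package suggests). The helper names locate_module_file and patch_target_sets are hypothetical and not part of this repository; the replacement strings are copied from the two lines above.

```python
import importlib.util
from pathlib import Path

# Original lines mapped to their modified versions, copied from the instructions above.
REPLACEMENTS = {
    'UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU"}':
        'UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU", '
        '"MemoryEfficientCrossAttention"}',
    'UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU"}':
        'UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU", '
        '"MemoryEfficientCrossAttention", "ResBlock"}',
}


def locate_module_file(module_name):
    """Return the source file of an installed module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package itself is not installed.
        return None
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)


def patch_target_sets(source):
    """Apply the two edits to the text of lora.py; already patched text is unchanged."""
    for old, new in REPLACEMENTS.items():
        source = source.replace(old, new)
    return source


if __name__ == "__main__":
    path = locate_module_file("lora_diffusion.lora")
    if path is None:
        print("lora_diffusion is not installed in this environment")
    else:
        print("Edit this file:", path)
        # To apply the edit in place, back up the file first and then run:
        # path.write_text(patch_target_sets(path.read_text()))
```

The in-place write is left commented out on purpose: line numbers and exact contents may differ between lora_diffusion versions, so it is worth checking the file by eye before overwriting it.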

💾 2. Data Preparation

We prepare the data in a similar way to GFPGAN and DiffBIR. The datasets for BFR and BSR are listed below:

For BFR evaluation, the BFR-test datasets (CelebA-Test, CelebChild-Test and LFW-Test) are available here, and WIDER-Test can be found here. For BFR training, please download the FFHQ dataset.

For BSR, we utilize ImageNet for training. For evaluation, you can refer to BSRGAN for RealSRSet.

To prepare the training file lists, run the script:

python ./scripts/make_file_list.py --img_folder /data/ILSVRC2012  --save_folder ./dataset/list/imagenet
python ./scripts/make_file_list.py --img_folder /data/FFHQ  --save_folder ./dataset/list/ffhq

The file list looks like this:

/path/to/image_1.png
/path/to/image_2.png
/path/to/image_3.png
...
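For datasets stored in an unusual layout, a list in the same one-path-per-line format can be produced by hand. The sketch below is not the repository's make_file_list.py; the function names and the set of accepted extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; the repository's script may accept a different set.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def collect_image_paths(img_folder):
    """Recursively gather image files under img_folder as sorted absolute paths."""
    root = Path(img_folder).resolve()
    return sorted(
        str(p) for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def write_file_list(paths, list_file):
    """Write one path per line, matching the file-list format shown above."""
    list_file = Path(list_file)
    list_file.parent.mkdir(parents=True, exist_ok=True)
    list_file.write_text("".join(p + "\n" for p in paths))
```

For example, write_file_list(collect_image_paths("/data/FFHQ"), "./dataset/list/ffhq/train.list") would write the FFHQ list. The output file name train.list is hypothetical; check the training config for the name the dataloader actually expects.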

🗂️ 3. Download Checkpoints

Please download our pretrained checkpoints from this link and put them under ./weights. The file directory should be:

|-- weights
|--|-- FlowIE_bfr_v1.ckpt
|--|-- FlowIE_bsr_v1.ckpt
...

📊 4. Test & Evaluation

You can test FlowIE with the following commands:

Evaluation for BFR

python inference_bfr.py --ckpt ./weights/FlowIE_bfr_v1.ckpt --input /data/celeba_512_validation_lq/ --output ./outputs/bfr_exp --has_aligned

Evaluation for BSR

python inference_bsr.py --ckpt ./weights/FlowIE_bsr_v1.ckpt  --input /data/testdata/  --output ./outputs/bsr_exp --sr_scale 4

Quick Test

For a quick test, we collect some test samples in ./assets. You can run the demo for BFR:

python inference_bfr.py --ckpt ./weights/FlowIE_bfr_v1.ckpt  --input ./assets/faces --output ./outputs/demo

And for BSR:

python inference_bsr.py --ckpt ./weights/FlowIE_bsr_v1.ckpt  --input ./assets/real-photos/  --output ./outputs/bsr_exp --tiled --sr_scale 4

Use --tiled for patch-based inference and --sr_scale to set the super-resolution scale (for example, 2 or 4). Set CUDA_VISIBLE_DEVICES (for example, CUDA_VISIBLE_DEVICES=1) to choose which GPU to run on.

Evaluation can be run on a single NVIDIA GeForce RTX 3090 GPU (24 GB VRAM). You can use more GPUs by specifying their GPU ids.
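To give an intuition for what patch-based inference with --tiled involves, the sketch below computes overlapping tile boxes that cover an image. It only illustrates the general technique: the function names, the tile size of 512, and the stride of 256 are assumptions, and the repository's own tiling logic may differ.

```python
def tile_starts(length, tile, stride):
    """Start offsets of tiles of size `tile` that cover [0, length) with step `stride`."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the border
    return starts


def tile_boxes(height, width, tile=512, stride=256):
    """Return (top, left, bottom, right) boxes that together cover the whole image."""
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in tile_starts(height, tile, stride)
        for left in tile_starts(width, tile, stride)
    ]
```

Each box would be enhanced independently, and the overlapping regions are typically blended when the outputs are stitched back together. This keeps peak memory bounded by the tile size instead of the full image size.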

🔥 5. Training

The key component in FlowIE is a path estimator fine-tuned from Stable Diffusion v2.1 base. Please download it to ./weights. The other component is the initial module, which can be found in the checkpoints.

Before training, you also need to configure training-related information in ./configs/train_cldm.yaml. Then run this command to start training:

python train.py --config ./configs/train_cldm.yaml

🫰 Acknowledgments

We would like to express our sincere thanks to the author of DiffBIR for the clear code base and quick response to our issues.

We also thank CodeFormer, Real-ESRGAN and LoRA, as our code partially borrows from them.

A new version of FlowIE based on the Diffusion Transformer (DiT) architecture will be released soon! Thanks to the recent DiT works, including PixART and Stable Diffusion 3.

🔖 Citation

Please cite us if our work is useful for your research.

@misc{zhu2024flowie,
      title={FlowIE: Efficient Image Enhancement via Rectified Flow}, 
      author={Yixuan Zhu and Wenliang Zhao and Ao Li and Yansong Tang and Jie Zhou and Jiwen Lu},
      year={2024},
      eprint={2406.00508},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🔑 License

This code is distributed under the MIT License.
