FlowIE
This repository contains the official implementation of "FlowIE: Efficient Image Enhancement via Rectified Flow"
Install / Use
/learn @EternalEvan/FlowIEREADME
FlowIE: Efficient Image Enhancement via Rectified Flow (CVPR 2024)
Yixuan Zhu*, Wenliang Zhao* $\dagger$, Ao Li, Yansong Tang, Jie Zhou, Jiwen Lu $\ddagger$
* Equal contribution $\dagger$ Project leader $\ddagger$ Corresponding author
The repository contains the official implementation for the paper "FlowIE: Efficient Image Enhancement via Rectified Flow" (CVPR 2024, oral presentation).
FlowIE is a simple yet highly effective <ins>Flow</ins>-based <ins>I</ins>mage <ins>E</ins>nhancement framework that estimates straight-line paths from an elementary distribution to high-quality images.
📋 To-Do List
- [x] Release model and inference code.
- [x] Release code for training dataloader.
💡 Pipeline

😀Quick Start
⚙️ 1. Installation
We recommend you to use an Anaconda virtual environment. If you have installed Anaconda, run the following commands to create and activate a virtual environment.
conda env create -f requirements.txt
conda activate FlowIE
📑 2. Modify the lora configuration
Since we use MemoryEfficientCrossAttention to accelerate the inference process, we need to slightly modify the lora.py in lora_diffusion package, which could be done in 2 minutes:
- (1) Locate the
lora.pyfile in the package directory. You can easily find this file by using the "go to definition" button in Line 4 of the./model/cldm.pyfile. - (2) Make the following modifications to Lines 159-161 in
lora.py:
Original Code:
UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU"}
UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU"}
Modified Code:
UNET_DEFAULT_TARGET_REPLACE = {"CrossAttention", "Attention", "GEGLU", "MemoryEfficientCrossAttention"}
UNET_EXTENDED_TARGET_REPLACE = {"ResnetBlock2D", "CrossAttention", "Attention", "GEGLU", "MemoryEfficientCrossAttention", "ResBlock"}
💾 2. Data Preparation
We prepare the data in a samilar way as GFPGAN & DiffBIR. We list the datasets for BFR and BSR as follows:
For BFR evaluation, please refer to here for BFR-test datasets, which include CelebA-Test, CelebChild-Test and LFW-Test. The WIDER-Test can be found in here. For BFR training, please download the FFHQ dataset.
For BSR, we utilize ImageNet for training. For evaluation, you can refer to BSRGAN for RealSRSet.
To prepare the training list, you need to simply run the script:
python ./scripts/make_file_list.py --img_folder /data/ILSVRC2012 --save_folder ./dataset/list/imagenet
python ./scripts/make_file_list.py --img_folder /data/FFHQ --save_folder ./dataset/list/ffhq
The file list looks like this:
/path/to/image_1.png
/path/to/image_2.png
/path/to/image_3.png
...
🗂️ 3. Download Checkpoints
Please download our pretrained checkpoints from this link and put them under ./weights. The file directory should be:
|-- checkpoints
|--|-- FlowIE_bfr_v1.ckpt
|--|-- FlowIE_bsr_v1.ckpt
...
📊 4. Test & Evaluation
You can test FlowIE with following commands:
- Evaluation for BFR
python inference_bfr.py --ckpt ./weights/FlowIE_bfr_v1.ckpt --has_aligned --input /data/celeba_512_validation_lq/ --output ./outputs/bfr_exp --has_aligned
- Evaluation for BSR
python inference_bsr.py --ckpt ./weights/FlowIE_bsr_v1.ckpt --input /data/testdata/ --output ./outputs/bsr_exp --sr_scale 4
- Quick Test
For a quick test, we collect some test samples in ./assets. You can run the demo for BFR:
python inference_bfr.py --ckpt ./weights/FlowIE_bfr_v1.ckpt --input ./assets/faces --output ./outputs/demo
And for BSR:
python inference_bsr.py --ckpt ./weights/FlowIE_bsr_v1.pth --input ./assets/real-photos/ --output ./outputs/bsr_exp --tiled --sr_scale 4
You can use --tiled for patch-based inference and use --sr_scale tp set the super-resolution scale, like 2 or 4. You can set CUDA_VISIBLE_DEVICES=1 to choose the devices.
The evaluation process can be done with one Nvidia GeForce RTX 3090 GPU (24GB VRAM). You can use more GPUs by specifying the GPU ids.
🔥 5. Training
The key component in FlowIE is a path estimator tuned from Stable Diffusion v2.1 base. Please download it to ./weights. Another part is the initial module, which can be found in checkpoints.
Before training, you also need to configure training-related information in ./configs/train_cldm.yaml. Then run this command to start training:
python train.py --config ./configs/train_cldm.yaml
<!--**For evaluation only, you can just prepare 3DPW dataset.**-->
<!--


```
|-- common
| |-- utils
| | |-- human_model_files
| | |-- smplpytorch
|-- data
| |-- J_regressor_extra.npy
| |-- 3DPW
| | |-- 3DPW_latest_test.json
| | |-- 3DPW_oc.json
| | |-- 3DPW_pc.json
| | |-- 3DPW_validation_crowd_hhrnet_result.json
| | |-- imageFiles
| | |-- sequenceFiles
```-->
🫰 Acknowledgments
We would like to express our sincere thanks to the author of DiffBIR for the clear code base and quick response to our issues.
We also thank CodeFormer, Real-ESRGAN and LoRA, for our code is partially borrowing from them.
The new version of FlowIE based on Denoising Transformer (DiT) structure will be released soon! Thanks the newest works of DiTs, including PixART and Stable Diffusion 3.
🔖 Citation
Please cite us if our work is useful for your research.
@misc{zhu2024flowie,
title={FlowIE: Efficient Image Enhancement via Rectified Flow},
author={Yixuan Zhu and Wenliang Zhao and Ao Li and Yansong Tang and Jie Zhou and Jiwen Lu},
year={2024},
eprint={2406.00508},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
🔑 License
This code is distributed under an MIT LICENSE.
Related Skills
node-connect
333.7kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
82.0kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
333.7kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
82.0kCommit, push, and open a PR
