LucidFlux
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer, ICLR 2026
🌐 Website | 📘 Arxiv | 📄 Technical Report | 🤗 Models | 🔧 Fal-AI Demo&API
</div><img alt="abs_image" src="images/framework/abs_image.png" />
<details open><summary>💡 We also have other projects on 4K text-to-image generation and RL-enhanced LucidFlux that may interest you. ✨</summary><p>
[CVPR 2026] UltraFlux: Data-Model Co-Design for High-Quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios <br> Tian Ye<sup>1</sup>*‡, Song Fei<sup>1</sup>*, Lei Zhu<sup>1,2</sup>† <br>
<br>
LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution <br> Song Fei<sup>1,†</sup>, Tian Ye<sup>1,†</sup>, Sixiang Chen<sup>1</sup>, Zhaohu Xing<sup>1</sup>, Jianyu Lai<sup>1</sup>, Lei Zhu<sup>1,2,*</sup> <br>
</p></details>
📰 News & Updates
[2026.03.19] - We released the training code for LucidFlux.
[2026.03.13] - LucidFlux now integrates UltraFlux's VAE to enable 2K image restoration! 🚀
[2026.03.10] - We released the metadata for the clean images used in LucidFlux at LucidFlux-Training-Data and the filtering pipeline in tools/filtering_pipeline.py.
[2026.02.06] - LucidFlux is accepted by ICLR'26.
[2025.10.07] - Thanks to smthemex for developing ComfyUI_LucidFlux, which enables LucidFlux to run with as little as 8-12 GB of memory through the ComfyUI integration.
[2025.10.06] - LucidFlux now supports offload and precomputed prompt embeddings, eliminating the need to load T5 or CLIP during inference. This cuts memory usage substantially: inference now runs with as little as 28 GB of VRAM, greatly improving deployment efficiency.
[2025.10.05] - LucidFlux has been officially added to the Fal AI Playground! You can now try the online demo and access the Fal API directly here:
👉 LucidFlux on Fal AI
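The precomputed-prompt-embedding workflow from the 2025.10.06 update can be sketched as follows. This is our own illustrative sketch, not the repository's code: the cache location, file format, and the toy `encode_fn` are stand-ins for the real T5/CLIP pipeline.

```python
import numpy as np
from pathlib import Path

# Hypothetical cache location -- the real pipeline's paths/format may differ.
CACHE = Path("prompt_cache")

def precompute_prompt_embedding(prompt: str, encode_fn) -> Path:
    """Run the text encoder once offline and cache its output,
    so T5/CLIP never need to be loaded at inference time."""
    CACHE.mkdir(exist_ok=True)
    emb = encode_fn(prompt)                       # e.g. a T5/CLIP forward pass
    out = CACHE / f"{abs(hash(prompt)):x}.npy"
    np.save(out, emb)
    return out

def load_prompt_embedding(path: Path) -> np.ndarray:
    # At inference, only this cheap load is needed -- no text encoder in VRAM.
    return np.load(path)

# Stand-in encoder for demonstration (a real one would be T5/CLIP).
toy_encoder = lambda p: np.zeros((77, 4096), dtype=np.float32)
path = precompute_prompt_embedding("a photo", toy_encoder)
emb = load_prompt_embedding(path)
print(emb.shape)  # (77, 4096)
```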
👥 Authors
Song Fei<sup>1</sup>*, Tian Ye<sup>1</sup>*‡, Lujia Wang<sup>1</sup> , Lei Zhu<sup>1,2</sup>†
<sup>1</sup>The Hong Kong University of Science and Technology (Guangzhou)
<sup>2</sup>The Hong Kong University of Science and Technology
*Equal Contribution, ‡Project Leader, †Corresponding Author
🌟 What is LucidFlux?
LucidFlux is a caption-free universal image restoration framework that leverages a lightweight dual-branch conditioner and adaptive modulation to guide a large diffusion transformer (Flux.1) with minimal overhead, achieving robust, high-fidelity restoration without relying on text prompts or MLLM captions.
📊 Performance Benchmarks
<div align="center">📈 Quantitative Results
<img alt="quantitative_comparison" src="images/framework/quantitative_comparison.png" /> <img alt="quantitative_comparison_commercial" src="images/framework/quantitative_comparison_commercial.png" /> </div>🎭 Gallery & Examples
<div align="center">🎨 LucidFlux Gallery
🔍 Comparison with Open-Source Methods
<table> <tr align="center"> <td width="200"><b>LQ</b></td> <td width="200"><b>SinSR</b></td> <td width="200"><b>SeeSR</b></td> <td width="200"><b>SUPIR</b></td> <td width="200"><b>DreamClear</b></td> <td width="200"><b>Ours</b></td> </tr> <tr align="center"><td colspan="6"><img src="images/comparison/040.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/041.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/111.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/123.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/160.jpg" width="1200"></td></tr> </table> <details> <summary>Show more examples</summary> <table> <tr align="center"><td colspan="6"><img src="images/comparison/013.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/079.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/082.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/137.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/166.jpg" width="1200"></td></tr> </table> </details>💼 Comparison with Commercial Models
<table> <tr align="center"> <td width="200"><b>LQ</b></td> <td width="200"><b>HYPIR-FLUX</b></td> <td width="200"><b>Topaz</b></td> <td width="200"><b>Seedream 4.0</b></td> <td width="200"><b>MeiTu SR</b></td> <td width="200"><b>Gemini-NanoBanana</b></td> <td width="200"><b>Ours</b></td> </tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_061.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_094.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_205.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_209.jpg" width="1400"></td></tr> </table> <details> <summary>Show more examples</summary> <table> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_062.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_160.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_111.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_123.jpg" width="1400"></td></tr> </table> </details> </div>🏗️ Model Architecture
<div align="center"> <img src="images/framework/framework.png" alt="LucidFlux Framework Overview" width="1200"/> <br> <em><strong>Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer</strong></em> </div>Our unified framework consists of four critical components in the training workflow:
🎨 Dual-Branch Conditioner for Low-Quality Image Conditioning
🎯 Timestep and Layer-Adaptive Condition Injection
🔄 Semantic Priors from Siglip for Caption-Free Semantic Alignment
🔤 Scaling Up Real-world High-Quality Data for Universal Image Restoration
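To make the second component above concrete, here is a minimal numpy sketch of the general idea behind timestep- and layer-adaptive condition injection. All shapes, the FiLM-style scale/shift form, and the per-layer head `W` are illustrative assumptions; the actual LucidFlux modulation is defined in the paper and training code.

```python
import numpy as np

rng = np.random.default_rng(0)

def adaptive_modulation(hidden, cond, t_emb, layer_idx, W):
    """Timestep- and layer-adaptive injection (illustrative only):
    a per-layer linear head maps [condition features, timestep embedding]
    to a scale/shift pair that modulates the transformer hidden states."""
    z = np.concatenate([cond, t_emb])             # condition + timestep context
    scale, shift = np.split(W[layer_idx] @ z, 2)  # layer-specific head
    return hidden * (1.0 + scale) + shift         # FiLM-style modulation

d, c = 8, 6                                       # hidden dim, context dim
n_layers = 4
W = rng.normal(size=(n_layers, 2 * d, 2 * c)) * 0.01  # one head per layer
hidden = rng.normal(size=d)
cond, t_emb = rng.normal(size=c), rng.normal(size=c)

out = adaptive_modulation(hidden, cond, t_emb, layer_idx=2, W=W)
print(out.shape)  # (8,)
```

The key point the sketch illustrates: because the head is indexed by `layer_idx` and consumes `t_emb`, the strength of the low-quality-image condition can vary across both denoising timesteps and transformer depth.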
🚀 Quick Start
⚠️ The default setup requires roughly 28 GB of GPU VRAM.
🔧 Installation
```bash
# Clone the repository
git clone https://github.com/W2GenAI-Lab/LucidFlux.git
cd LucidFlux

# Create and activate the conda environment
conda create -n lucidflux python=3.11
conda activate lucidflux

# Install PyTorch (CUDA 12.8 wheels)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

# Install the remaining dependencies
pip install -r requirements.txt
pip install --upgrade timm
```
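Once installation finishes, a quick stdlib-only check (our own sketch, not a repository tool) can confirm that the key packages resolve inside the `lucidflux` environment:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# After `pip install -r requirements.txt`, this should print [] in the env.
print(missing_packages(["torch", "torchvision", "timm"]))
```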
Inference
Prepare the models in two steps, then run a single command.
- Log in to Hugging Face (required for the gated FLUX.1-dev weights). Skip this step if you are already logged in.

```bash
python -m tools.hf_login --token "$HF_TOKEN"
```
- Download the required weights to fixed paths and export the environment variables:

```bash
# FLUX.1-dev (flow + ae), SwinIR prior, T5, CLIP, SigLIP, and the LucidFlux checkpoint go to ./weights
python -m tools.download_weights --dest weights

# Exports FLUX_DEV_FLOW/FLUX_DEV_AE into your shell (Linux/macOS)
source weights/env.sh
```

On Windows, open `weights\env.sh`, replace each leading `export` with `set`, and paste the resulting commands into Command Prompt.
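Before running inference, a small stdlib check like the one below (our own sketch, not part of the repository) can confirm that the exported variables actually point to files on disk:

```python
import os
from pathlib import Path

def check_weight_env(env=os.environ):
    """Verify that the variables exported by weights/env.sh point to files.
    Returns a list of problems (empty means the setup looks good)."""
    problems = []
    for var in ("FLUX_DEV_FLOW", "FLUX_DEV_AE"):
        path = env.get(var)
        if not path:
            problems.append(f"{var} is not set -- did you `source weights/env.sh`?")
        elif not Path(path).is_file():
            problems.append(f"{var} points to a missing file: {path}")
    return problems

if __name__ == "__main__":
    for p in check_weight_env():
        print(p)
```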
Run inference (uses fixed relative paths):

```bash
bash infere
```
