LucidFlux
LucidFlux: Caption-Free Photo-Realistic Image Restoration via a Large-Scale Diffusion Transformer, ICLR 2026
🌐 Website | 📘 Arxiv | 📄 Technical Report | 🤗 Models | 🔧 Fal-AI Demo&API
</div><img alt="abs_image" src="images/framework/abs_image.png" />
<details open><summary>💡 We also have other projects on 4K text-to-image generation and RL-enhanced LucidFlux that may interest you. ✨</summary><p>
[CVPR 2026] UltraFlux: Data-Model Co-Design for High-Quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios <br> Tian Ye<sup>1</sup>*‡, Song Fei<sup>1</sup>*, Lei Zhu<sup>1,2</sup>† <br>
<br>
LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution <br> Song Fei<sup>1,†</sup>, Tian Ye<sup>1,†</sup>, Sixiang Chen<sup>1</sup>, Zhaohu Xing<sup>1</sup>, Jianyu Lai<sup>1</sup>, Lei Zhu<sup>1,2,*</sup> <br>
</p></details>
📰 News & Updates
[2026.03.19] - We released the training code for LucidFlux.
[2026.03.13] - LucidFlux now integrates UltraFlux's VAE to enable 2K image restoration! 🚀
[2026.03.10] - We released the metadata for the clean images used in LucidFlux at LucidFlux-Training-Data and the filtering pipeline in tools/filtering_pipeline.py.
[2026.02.06] - LucidFlux is accepted by ICLR'26.
[2025.10.07] - Thanks to smthemex for developing ComfyUI_LucidFlux, which enables LucidFlux to run with as little as 8-12 GB of memory through the ComfyUI integration.
[2025.10.06] - LucidFlux now supports offload and precomputed prompt embeddings, eliminating the need to load T5 or CLIP during inference. This cuts memory usage substantially: inference now runs with as little as 28 GB of VRAM, greatly improving deployment efficiency.
[2025.10.05] - LucidFlux has been officially added to the Fal AI Playground! You can now try the online demo and access the Fal API directly here:
👉 LucidFlux on Fal AI
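The precomputed-prompt-embedding workflow from the 2025.10.06 update can be sketched as follows. This is our own illustrative sketch, not the repository's code: the cache location, file format, and the toy `encode_fn` are stand-ins for the real T5/CLIP pipeline.

```python
import numpy as np
from pathlib import Path

# Hypothetical cache location -- the real pipeline's paths/format may differ.
CACHE = Path("prompt_cache")

def precompute_prompt_embedding(prompt: str, encode_fn) -> Path:
    """Run the text encoder once offline and cache its output,
    so T5/CLIP never need to be loaded at inference time."""
    CACHE.mkdir(exist_ok=True)
    emb = encode_fn(prompt)                       # e.g. a T5/CLIP forward pass
    out = CACHE / f"{abs(hash(prompt)):x}.npy"
    np.save(out, emb)
    return out

def load_prompt_embedding(path: Path) -> np.ndarray:
    # At inference, only this cheap load is needed -- no text encoder in VRAM.
    return np.load(path)

# Stand-in encoder for demonstration (a real one would be T5/CLIP).
toy_encoder = lambda p: np.zeros((77, 4096), dtype=np.float32)
path = precompute_prompt_embedding("a photo", toy_encoder)
emb = load_prompt_embedding(path)
print(emb.shape)  # (77, 4096)
```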
👥 Authors
Song Fei<sup>1</sup>*, Tian Ye<sup>1</sup>*‡, Lujia Wang<sup>1</sup> , Lei Zhu<sup>1,2</sup>†
<sup>1</sup>The Hong Kong University of Science and Technology (Guangzhou)
<sup>2</sup>The Hong Kong University of Science and Technology
*Equal Contribution, ‡Project Leader, †Corresponding Author
🌟 What is LucidFlux?
LucidFlux is a caption-free universal image restoration framework that leverages a lightweight dual-branch conditioner and adaptive modulation to guide a large diffusion transformer (Flux.1) with minimal overhead, achieving robust, high-fidelity restoration without relying on text prompts or MLLM captions.
📊 Performance Benchmarks
<div align="center">📈 Quantitative Results
<img alt="quantitative_comparison" src="images/framework/quantitative_comparison.png" /> <img alt="quantitative_comparison_commercial" src="images/framework/quantitative_comparison_commercial.png" /> </div>🎭 Gallery & Examples
<div align="center">🎨 LucidFlux Gallery
🔍 Comparison with Open-Source Methods
<table> <tr align="center"> <td width="200"><b>LQ</b></td> <td width="200"><b>SinSR</b></td> <td width="200"><b>SeeSR</b></td> <td width="200"><b>SUPIR</b></td> <td width="200"><b>DreamClear</b></td> <td width="200"><b>Ours</b></td> </tr> <tr align="center"><td colspan="6"><img src="images/comparison/040.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/041.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/111.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/123.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/160.jpg" width="1200"></td></tr> </table> <details> <summary>Show more examples</summary> <table> <tr align="center"><td colspan="6"><img src="images/comparison/013.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/079.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/082.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/137.jpg" width="1200"></td></tr> <tr align="center"><td colspan="6"><img src="images/comparison/166.jpg" width="1200"></td></tr> </table> </details>💼 Comparison with Commercial Models
<table> <tr align="center"> <td width="200"><b>LQ</b></td> <td width="200"><b>HYPIR-FLUX</b></td> <td width="200"><b>Topaz</b></td> <td width="200"><b>Seedream 4.0</b></td> <td width="200"><b>MeiTu SR</b></td> <td width="200"><b>Gemini-NanoBanana</b></td> <td width="200"><b>Ours</b></td> </tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_061.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_094.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_205.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_209.jpg" width="1400"></td></tr> </table> <details> <summary>Show more examples</summary> <table> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_062.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_160.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_111.jpg" width="1400"></td></tr> <tr align="center"><td colspan="7"><img src="images/commercial_comparison/commercial_123.jpg" width="1400"></td></tr> </table> </details> </div>🏗️ Model Architecture
<div align="center"> <img src="images/framework/framework.png" alt="LucidFlux Framework Overview" width="1200"/> <br> <em><strong>Caption-Free Universal Image Restoration with a Large-Scale Diffusion Transformer</strong></em> </div>Our unified framework consists of four critical components in the training workflow:
🎨 Dual-Branch Conditioner for Low-Quality Image Conditioning
🎯 Timestep and Layer-Adaptive Condition Injection
🔄 Semantic Priors from Siglip for Caption-Free Semantic Alignment
🔤 Scaling Up Real-world High-Quality Data for Universal Image Restoration
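To make the second component above concrete, here is a minimal numpy sketch of the general idea behind timestep- and layer-adaptive condition injection. All shapes, the FiLM-style scale/shift form, and the per-layer head `W` are illustrative assumptions; the actual LucidFlux modulation is defined in the paper and training code.

```python
import numpy as np

rng = np.random.default_rng(0)

def adaptive_modulation(hidden, cond, t_emb, layer_idx, W):
    """Timestep- and layer-adaptive injection (illustrative only):
    a per-layer linear head maps [condition features, timestep embedding]
    to a scale/shift pair that modulates the transformer hidden states."""
    z = np.concatenate([cond, t_emb])             # condition + timestep context
    scale, shift = np.split(W[layer_idx] @ z, 2)  # layer-specific head
    return hidden * (1.0 + scale) + shift         # FiLM-style modulation

d, c = 8, 6                                       # hidden dim, context dim
n_layers = 4
W = rng.normal(size=(n_layers, 2 * d, 2 * c)) * 0.01  # one head per layer
hidden = rng.normal(size=d)
cond, t_emb = rng.normal(size=c), rng.normal(size=c)

out = adaptive_modulation(hidden, cond, t_emb, layer_idx=2, W=W)
print(out.shape)  # (8,)
```

The key point the sketch illustrates: because the head is indexed by `layer_idx` and consumes `t_emb`, the strength of the low-quality-image condition can vary across both denoising timesteps and transformer depth.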
🚀 Quick Start
⚠️ The default setup requires roughly 28 GB of GPU VRAM.
🔧 Installation
```bash
# Clone the repository
git clone https://github.com/W2GenAI-Lab/LucidFlux.git
cd LucidFlux

# Create and activate the conda environment
conda create -n lucidflux python=3.11
conda activate lucidflux

# Install PyTorch (CUDA 12.8 wheels)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

# Install the remaining dependencies
pip install -r requirements.txt
pip install --upgrade timm
```
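Once installation finishes, a quick stdlib-only check (our own sketch, not a repository tool) can confirm that the key packages resolve inside the `lucidflux` environment:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# After `pip install -r requirements.txt`, this should print [] in the env.
print(missing_packages(["torch", "torchvision", "timm"]))
```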
Inference
Prepare the models in two steps, then run a single command.
- Log in to Hugging Face (required for the gated FLUX.1-dev weights). Skip this step if you are already logged in.

```bash
python -m tools.hf_login --token "$HF_TOKEN"
```
- Download the required weights to fixed paths and export the environment variables:

```bash
# FLUX.1-dev (flow + ae), SwinIR prior, T5, CLIP, SigLIP, and the LucidFlux checkpoint go to ./weights
python -m tools.download_weights --dest weights

# Exports FLUX_DEV_FLOW/FLUX_DEV_AE into your shell (Linux/macOS)
source weights/env.sh
```

On Windows, open `weights\env.sh`, replace each leading `export` with `set`, and paste the resulting commands into Command Prompt.
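Before running inference, a small stdlib check like the one below (our own sketch, not part of the repository) can confirm that the exported variables actually point to files on disk:

```python
import os
from pathlib import Path

def check_weight_env(env=os.environ):
    """Verify that the variables exported by weights/env.sh point to files.
    Returns a list of problems (empty means the setup looks good)."""
    problems = []
    for var in ("FLUX_DEV_FLOW", "FLUX_DEV_AE"):
        path = env.get(var)
        if not path:
            problems.append(f"{var} is not set -- did you `source weights/env.sh`?")
        elif not Path(path).is_file():
            problems.append(f"{var} points to a missing file: {path}")
    return problems

if __name__ == "__main__":
    for p in check_weight_env():
        print(p)
```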
Run inference (uses fixed relative paths):

```bash
bash infere
```
