[SIGGRAPH Asia 2025] Official code for "Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models."

<div align="center"> <h1><i>Bokeh Diffusion</i>: Defocus Blur Control in Text-to-Image Diffusion Models</h1>

Armando Fortes, Tianyi Wei, Shangchen Zhou, Xingang Pan

S-Lab, Nanyang Technological University

Project Page arXiv Dataset Model

SIGGRAPH Asia 2025


Bokeh Diffusion enables precise, scene-consistent bokeh transitions in text-to-image diffusion models

🎥 For more visual results, check out our <a href="https://atfortes.github.io/projects/bokeh-diffusion/">project page</a>.

</div>

📮 Update

  • [2025.09] The model checkpoint and inference code are released.
  • [2025.08] Bokeh Diffusion is conditionally accepted at SIGGRAPH Asia 2025! 😄🎉
  • [2025.03] This repo is created.

🚧 TODO

  • [X] Release Dataset
  • [X] Release Model Weights
  • [X] Release Inference Code
  • [ ] Release Training Code

⚙️ Installation

Our environment has been tested on CUDA 12.6.

git clone https://github.com/atfortes/BokehDiffusion.git
cd BokehDiffusion

conda create -n bokehdiffusion -c conda-forge python=3.10
conda activate bokehdiffusion
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn==2.7.4.post1 --no-build-isolation
pip install -r requirements.txt
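
After installing, it can help to confirm that the environment actually holds the pinned versions. The following is a minimal sketch, not part of the repository: the names `PINNED`, `installed_version`, and `check_pins` are hypothetical, and the pins are copied from the pip commands above. It assumes that wheels from the cu126 index report a local version suffix such as `+cu126`, which is stripped before comparison.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional

# Pins copied from the pip commands in the installation steps.
PINNED = {
    "torch": "2.6.0",
    "torchvision": "0.21.0",
    "torchaudio": "2.6.0",
    "flash-attn": "2.7.4.post1",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Return {package: problem} for every pin that is missing or mismatched."""
    problems = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found is None:
            problems[package] = "not installed"
        elif found.split("+", 1)[0] != expected:
            # The local suffix (assumed form: "+cu126") is ignored here.
            problems[package] = f"found {found}, expected {expected}"
    return problems


# Print one line per problem; no output means every pin matched.
for package, problem in check_pins(PINNED).items():
    print(f"{package}: {problem}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages installed.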

💡 Quick Start

Unbounded image generation from text and bokeh level input:

python inference_flux.py \
    --prompt "a well-loved book lies forgotten on a park bench beneath a towering tree, its pages gently ruffling in the wind" \
    --bokeh_target 15.0

Grounded image generation for scene-consistency:

python inference_flux.py \
    --prompt "a well-loved book lies forgotten on a park bench beneath a towering tree, its pages gently ruffling in the wind" \
    --bokeh_target 0.0 4.0 8.0 12.0 18.0 28.0 \
    --bokeh_pivot 15.0 \
    --num_grounding_steps 24
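
The two modes above differ only in their arguments, so they can be assembled programmatically when scripting several runs. This is a rough sketch under stated assumptions: `build_command` is a hypothetical helper, not part of the repository, and it emits only the flags shown in the commands above. Whether the script accepts several `--bokeh_target` values without a pivot is not confirmed here.

```python
import shlex
from typing import List, Optional, Sequence

SCRIPT = "inference_flux.py"  # script name taken from the Quick Start commands


def build_command(
    prompt: str,
    bokeh_targets: Sequence[float],
    bokeh_pivot: Optional[float] = None,
    num_grounding_steps: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list for one inference run (hypothetical helper).

    Grounding flags are added only when a pivot is given.
    """
    if not bokeh_targets:
        raise ValueError("at least one bokeh target is required")
    cmd = ["python", SCRIPT, "--prompt", prompt, "--bokeh_target"]
    cmd += [str(float(t)) for t in bokeh_targets]
    if bokeh_pivot is not None:
        cmd += ["--bokeh_pivot", str(float(bokeh_pivot))]
        if num_grounding_steps is not None:
            cmd += ["--num_grounding_steps", str(int(num_grounding_steps))]
    return cmd


prompt = (
    "a well-loved book lies forgotten on a park bench beneath a towering "
    "tree, its pages gently ruffling in the wind"
)

# Unbounded generation: a single bokeh level, no pivot.
unbounded = build_command(prompt, [15.0])

# Grounded generation: several bokeh levels anchored to one pivot.
grounded = build_command(
    prompt,
    [0.0, 4.0, 8.0, 12.0, 18.0, 28.0],
    bokeh_pivot=15.0,
    num_grounding_steps=24,
)

# Print shell-quoted commands instead of running them.
print(shlex.join(unbounded))
print(shlex.join(grounded))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.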

Refer to the inference script (inference_flux.py) for further input options (e.g., seed, inference steps, guidance scale).

📑 Citation

If you find our work useful, please cite the following paper:

@article{fortes2025bokeh,
    title     = {Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models},
    author    = {Fortes, Armando and Wei, Tianyi and Zhou, Shangchen and Pan, Xingang},
    journal   = {arXiv preprint arXiv:2503.08434},
    year      = {2025},
}

©️ License

This project is licensed under the NTU S-Lab License 1.0. Redistribution and use must follow this license.

🤝 Acknowledgements

We would like to thank the following projects that made this work possible:

  • Megalith-10M is used as the base dataset for collecting real in-the-wild photographs.
  • BokehMe provides the synthetic blur rendering engine for generating defocus augmentations.
  • Depth-Pro is used to estimate metric depth maps.
  • RMBG v2.0 is used to generate foreground masks.
  • FLUX, Realistic-Vision, and Cyber-Realistic are used as the base models for generating the samples in the paper.