# DiffLens

Official code for "DiffLens: Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability" (CVPR 2025)
## Update

- [2025.06.13] Release code.
## Pipeline

**TL;DR:** A method to dissect and mitigate social bias in diffusion models by identifying and intervening in key bias-generating features within neuron activities.
## Installation

Please follow the steps below to perform the installation:

1. Create a virtual environment

```bash
conda create -n difflens python=3.9
conda activate difflens
```

2. Install packages

```bash
pip install -r requirements.txt
```
## Download models and datasets

1. P2 model

Download the CelebA-HQ checkpoint from P2 weighting (CVPR 2022), and then put the checkpoint into the folder `pretrained_model/P2`.

2. Stable Diffusion v1.5

Download the model from Stable Diffusion v1.5.

3. FairFace checkpoint (for the FD metric)

Download the `res34_fair_align_multi_7_20190809.pt` checkpoint from FairFace (WACV 2021), and then put the checkpoint into the folder `evaluation/Fairface`.

If you want to evaluate images generated by Stable Diffusion v1.5, you also need to download the dlib checkpoints from FairFace (WACV 2021).
## Prepare latent dataset for P2 model

You can skip this step and proceed directly to the "Bias Mitigation" step, because we have provided ready-to-use resources.

```bash
cd ./utils/P2
# for training the h-classifier
python prepare_h_classifier_latents.py
# for locating features
python prepare_locating_latents.py
```
## Dissecting Bias Mechanism

> [!TIP]
> For P2, you can skip this step and proceed directly to the "Bias Mitigation" step, because we have provided ready-to-use resources in Checkpoints (Google Drive). For Stable Diffusion v1.5, you can skip both the "Train SAE" and "Train h-classifier" steps.
1. Train SAE

```bash
# You can also use torchrun to start PyTorch DDP.
# Set train_sae.train to True in ./config/P2/example_P2.yaml.
python -m difflens ./config/P2/example_P2.yaml
```

The SAE checkpoint will then be saved into `sae-ckpts`.
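For reference, a sparse autoencoder of this kind pairs a ReLU encoder with a linear decoder and adds an L1 sparsity penalty on the feature activations. A minimal numpy sketch of the objective (all sizes and the `sparsity_coef` value are illustrative, not the repo's actual hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_hidden = 512, 4096  # illustrative sizes, not the paper's
W_enc = rng.normal(0, 0.02, (d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(0, 0.02, (d_hidden, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode activations into sparse features, then reconstruct."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU -> non-negative, sparse features
    x_hat = f @ W_dec + b_dec               # linear reconstruction
    return f, x_hat

def sae_loss(x, sparsity_coef=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    f, x_hat = sae_forward(x)
    recon = np.mean((x - x_hat) ** 2)
    l1 = np.mean(np.abs(f))
    return recon + sparsity_coef * l1

x = rng.normal(size=(8, d_model))  # a stand-in batch of diffusion activations
print(sae_loss(x))
```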
2. Train h-classifier

```bash
# Set train_h_classifier.train to True in ./config/P2/example_P2.yaml.
python -m difflens ./config/P2/train_h_classifier/h_classifier_config.yaml
```
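The h-classifier acts as a probe that predicts the target attribute from diffusion activations. A minimal sketch of such a probe, here a softmax linear classifier trained with plain gradient descent on stand-in data (the repo's actual architecture and training loop may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, n_classes = 256, 512, 2           # illustrative: e.g. gender has two classes
h = rng.normal(size=(n, d))             # stand-in h-space activations
y = rng.integers(0, n_classes, size=n)  # attribute labels

W = np.zeros((d, n_classes))
b = np.zeros(n_classes)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# plain gradient descent on the cross-entropy loss
for _ in range(200):
    p = softmax(h @ W + b)
    grad = (p - np.eye(n_classes)[y]) / n
    W -= 0.5 * (h.T @ grad)
    b -= 0.5 * grad.sum(axis=0)

acc = (softmax(h @ W + b).argmax(axis=1) == y).mean()
print(acc)  # training accuracy of the probe
```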
3. Locate features

```bash
# Set locate.locate to True in ./config/P2/example_P2.yaml.
python -m difflens ./config/P2/example_P2.yaml
```

After locating features, you need to aggregate them across diffusion timesteps:

```bash
cd ./utils/P2
python aggregate_features.py
```
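The aggregation idea can be sketched as follows; this is one plausible scheme (keep the features that are located at a majority of timesteps), and the actual rule used by `aggregate_features.py` may differ:

```python
# Located feature indices per diffusion timestep (stand-in data; the real
# script reads these from the locating step's output files).
per_timestep = {
    900: [12, 87, 301],
    700: [12, 87, 44],
    500: [12, 301, 44],
}

# Count how often each feature index appears across timesteps.
counts = {}
for feats in per_timestep.values():
    for f in feats:
        counts[f] = counts.get(f, 0) + 1

# Keep features located at more than half of the timesteps.
threshold = len(per_timestep) / 2
aggregated = sorted(f for f, c in counts.items() if c > threshold)
print(aggregated)  # → [12, 44, 87, 301]
```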
## Bias Mitigation

We provide an example for the P2 model. You can download the SAE checkpoint and features from Checkpoints (Google Drive).
1. Generate

You can generate images using the following commands:

```bash
# Set bias_mitigation.generate to True in ./config/P2/example_P2.yaml.
python -m difflens ./config/P2/example_P2.yaml
```

Some parameters in `./config/P2/bias_mitigation/generate_config_age.yaml`:

- `target_attr: age` — choose one from `age`, `gender`, and `race`.
- `top_k_list: 30_30_30` — one value per class (e.g. `gender` has two classes, `male` and `female`, so its `top_k_list` should be `30_30`).
- `edit_ratios: 1.0_1.0_3.0` — one value per class (e.g. for `gender` the `edit_ratios` should be `1.0_1.0`).
- `edit_method: multiply_all_probability` — choose one from `multiply_all`, `multiply_all_probability`, `add_all`, and `add_all_probability`. `multiply_all` means *scaling* features regardless of position in images; `add_all` means *adding* features regardless of position in images.
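The underscore-separated fields carry one value per attribute class, so their lengths must match the class count. A hypothetical parser illustrating the format (the function names and the age class names are assumptions, not the repo's code):

```python
# Class lists per attribute; the age class names are illustrative.
CLASSES = {
    "age": ["young", "adult", "old"],
    "gender": ["male", "female"],
}

def parse_underscore_list(raw, cast):
    """Split an underscore-separated config string into typed values."""
    return [cast(v) for v in raw.split("_")]

def load_edit_config(target_attr, top_k_list, edit_ratios):
    classes = CLASSES[target_attr]
    top_k = parse_underscore_list(top_k_list, int)
    ratios = parse_underscore_list(edit_ratios, float)
    # One entry per class, in class order.
    assert len(top_k) == len(ratios) == len(classes)
    return dict(zip(classes, zip(top_k, ratios)))

print(load_edit_config("gender", "30_30", "1.0_1.0"))
# → {'male': (30, 1.0), 'female': (30, 1.0)}
```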
2. Evaluate

- Fairness Discrepancy (FD):

  For the FD metric, you can use FairFace to test your images. Here is an example for age.

  If you want to evaluate images generated by Stable Diffusion v1.5, you need to crop the faces using `./evaluation/crop_face/crop.py` first.

  ```bash
  cd ./evaluation/Fairface
  python age.py
  ```

  Edit `file_lists` if you want to change the path of the images.
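Fairness Discrepancy is typically computed as the distance between the attribute distribution predicted by the classifier and a uniform (fair) target. A numpy sketch under that assumption (the exact distance used by `age.py` may differ):

```python
import numpy as np

def fairness_discrepancy(pred_labels, n_classes):
    """L2 distance between the empirical attribute distribution
    and the uniform (fair) target distribution."""
    counts = np.bincount(pred_labels, minlength=n_classes)
    empirical = counts / counts.sum()
    uniform = np.full(n_classes, 1.0 / n_classes)
    return float(np.linalg.norm(empirical - uniform))

# e.g. FairFace gender predictions over six generated images
skewed = np.array([0, 0, 0, 0, 1, 1])    # heavily skewed toward class 0
balanced = np.array([0, 0, 0, 1, 1, 1])  # perfectly balanced
print(fairness_discrepancy(skewed, 2))    # larger -> more biased
print(fairness_discrepancy(balanced, 2))  # → 0.0
```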
- CLIP-I:

  We provide a script `clip_image_score.py` for CLIP-I.

  ```bash
  cd ./evaluation/CLIP
  python clip_image_score.py
  ```
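CLIP-I is the mean cosine similarity between CLIP image embeddings of matched image pairs (here, edited images against their unedited counterparts). A numpy sketch of the score on stand-in embeddings (the real script extracts features with a CLIP image encoder):

```python
import numpy as np

def clip_i(emb_a, emb_b):
    """Mean pairwise cosine similarity between matched image embeddings."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

rng = np.random.default_rng(0)
orig = rng.normal(size=(4, 512))                  # stand-ins for CLIP features
edited = orig + 0.05 * rng.normal(size=(4, 512))  # lightly edited images
print(clip_i(orig, edited))  # close to 1.0 when edits are small
```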
- FID:

  We use `clean-fid` to calculate FID.

  ```bash
  cd ./evaluation/FID
  python fid.py
  ```
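`clean-fid` fits Gaussians to the Inception features of the two image sets and computes the Fréchet distance between them. The formula itself, sketched on toy statistics (scipy's `sqrtm` handles the matrix square root):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrt(S1 @ S2))."""
    covmean = linalg.sqrtm(sigma1 @ sigma2).real  # drop tiny imaginary parts
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.trace(sigma1 + sigma2 - 2.0 * covmean))

d = 4
mu, sigma = np.zeros(d), np.eye(d)
print(frechet_distance(mu, sigma, mu, sigma))        # identical stats -> 0
print(frechet_distance(mu, sigma, mu + 1.0, sigma))  # mean shift of 1 per dim -> 4
```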
## Citation

```bibtex
@InProceedings{Shi_2025_CVPR,
    author    = {Shi, Yingdong and Li, Changming and Wang, Yifan and Zhao, Yongxiang and Pang, Anqi and Yang, Sibei and Yu, Jingyi and Ren, Kan},
    title     = {Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {8192-8202}
}
```
