[CVPR 2024] MACE: Mass Concept Erasure in Diffusion Models



Official implementation of MACE: Mass Concept Erasure in Diffusion Models.


Our other works on Concept Erasing/Unlearning: <br>

Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts <br> Leyang Li*, Shilin Lu*, Yan Ren, Adams Wai-Kin Kong <br> [arXiv] [code]


MACE: Mass Concept Erasure in Diffusion Models<br>


Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, Adams Wai-Kin Kong <br> CVPR 2024

Abstract: <br> The rapid expansion of large-scale text-to-image diffusion models has raised growing concerns regarding their potential misuse in creating harmful or misleading content. In this paper, we introduce MACE, a finetuning framework for the task of mass concept erasure. This task aims to prevent models from generating images that embody unwanted concepts when prompted. Existing concept erasure methods are typically restricted to handling fewer than five concepts simultaneously and struggle to find a balance between erasing concept synonyms (generality) and maintaining unrelated concepts (specificity). In contrast, MACE differs by successfully scaling the erasure scope up to 100 concepts and by achieving an effective balance between generality and specificity. This is achieved by leveraging closed-form cross-attention refinement along with LoRA finetuning, collectively eliminating the information of undesirable concepts. Furthermore, MACE integrates multiple LoRAs without mutual interference. We conduct extensive evaluations of MACE against prior methods across four different tasks: object erasure, celebrity erasure, explicit content erasure, and artistic style erasure. Our results reveal that MACE surpasses prior methods in all evaluated tasks.


*Framework overview:*

(a) Our framework focuses on tuning the prompt-related projection matrices within cross-attention (CA) blocks. (b) The pretrained U-Net's CA blocks are refined using a closed-form solution, discouraging the model from embedding the residual information of the target phrase into surrounding words. (c) For each concept targeted for removal, a distinct LoRA module is learned to eliminate its intrinsic information. (d) A closed-form solution is introduced to integrate multiple LoRA modules without interfering with one another while averting catastrophic forgetting.
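For intuition on step (d): composing multiple LoRA modules naively just sums each module's low-rank update into the frozen weight, which is exactly where mutual interference can arise; MACE's closed-form fusion replaces this naive sum with a solved integration. The sketch below shows only the generic naive baseline (standard LoRA math, not MACE's actual solver; all names are illustrative):

```python
import numpy as np

def apply_loras(W, loras):
    """Naive LoRA composition: W' = W + sum_i B_i @ A_i.

    `W` is the frozen weight matrix; `loras` is a list of (A, B) pairs,
    where A is (rank x in_dim) and B is (out_dim x rank).  MACE instead
    solves a closed-form objective for the fused update so that the
    modules do not interfere with one another (see the paper)."""
    W_fused = W.copy()          # leave the pretrained weights untouched
    for A, B in loras:
        W_fused += B @ A        # each module contributes a low-rank delta
    return W_fused
```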



Setup

Creating a Conda Environment

```shell
git clone https://github.com/Shilin-LU/MACE.git
conda create -n mace python=3.10
conda activate mace
conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.7 -c pytorch -c nvidia
```

Install Grounded-SAM (Official Version) to Prepare Masks for LoRA Tuning

Note: This installation process can be complex. You may skip this section and use the Hugging Face version to prepare data instead. The official version runs on a 24GB GPU, while the Hugging Face version likely requires more than 28GB.

```shell
export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
# export CUDA_HOME=/path/to/cuda-11.7/

cd MACE
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git
cd Grounded-Segment-Anything

# Install Segment Anything:
python -m pip install -e segment_anything

# Install Grounding DINO:
pip install --no-build-isolation -e GroundingDINO

# Install osx:
git submodule update --init --recursive
cd grounded-sam-osx && bash install.sh

# Install RAM & Tag2Text:
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -r ./recognize-anything/requirements.txt
pip install -e ./recognize-anything/
```

Download the pretrained weights of Grounded-SAM.

```shell
cd ..    # cd Grounded-Segment-Anything

# Download the pretrained groundingdino-swin-tiny model:
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

# Download the pretrained SAM model:
wget https://huggingface.co/lkeab/hq-sam/resolve/main/sam_hq_vit_h.pth
```

Install Other Dependencies

```shell
pip install diffusers==0.22.0 transformers==4.46.2 huggingface_hub==0.25.2
pip install accelerate openai omegaconf opencv-python
```

LLM Provider for Text Augmentation

MACE uses an LLM to generate text augmentations (captions) for the concepts being erased. By default it uses OpenAI (gpt-3.5-turbo), but you can switch to MiniMax (M2.7 / M2.5) via environment variables — no code changes required.

| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `openai` | LLM backend: `openai` or `minimax` |
| `MINIMAX_API_KEY` | — | Your MiniMax API key |
| `MINIMAX_MODEL` | `MiniMax-M2.7` | Model: `MiniMax-M2.7`, `MiniMax-M2.7-highspeed`, `MiniMax-M2.5`, `MiniMax-M2.5-highspeed` |
| `OPENAI_API_KEY` | — | Your OpenAI API key (when using the default provider) |

Using MiniMax:

```shell
export LLM_PROVIDER=minimax
export MINIMAX_API_KEY=your_api_key_here
# Optional: choose a specific model
export MINIMAX_MODEL=MiniMax-M2.7   # or MiniMax-M2.5-highspeed for faster inference

python training.py configs/object/erase_ship.yaml
```

MiniMax's API is OpenAI-compatible (base URL https://api.minimax.io/v1), so the same openai Python SDK is reused — no extra dependencies needed.
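Because the API is OpenAI-compatible, the provider switch amounts to choosing different client settings for the same SDK. A hypothetical sketch of such a selection (the function name and env handling are illustrative, not the repository's actual code; the resulting values would be passed to the `openai` client constructor and the chat-completion call):

```python
def make_client_kwargs(env):
    """Pick SDK settings based on the LLM_PROVIDER env var.

    `env` is a dict of environment variables (e.g. os.environ).
    Illustrative only: variable names follow the table above, but this
    is not the repository's actual implementation."""
    if env.get("LLM_PROVIDER", "openai") == "minimax":
        return {
            "api_key": env["MINIMAX_API_KEY"],
            "base_url": "https://api.minimax.io/v1",  # OpenAI-compatible endpoint
            "model": env.get("MINIMAX_MODEL", "MiniMax-M2.7"),
        }
    # Default provider: standard OpenAI endpoint with gpt-3.5-turbo
    return {"api_key": env["OPENAI_API_KEY"], "model": "gpt-3.5-turbo"}
```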

Data Preparation for Training MACE

To erase a concept, eight images along with their respective segmentation masks must be generated for it. To prepare the data for your intended concept, configure your settings in configs/object/erase_ship.yaml and run one of the commands below:

Grounded SAM (HuggingFace Version) (~28GB RAM)

To simplify environment configuration, you can instead use the Transformers-based Grounded SAM in data_preparation_transformers.py. It does not depend on a specific CUDA version, as long as your machine can run the transformers library.

```shell
CUDA_VISIBLE_DEVICES=0 python data_preparation_transformers.py configs/object/erase_ship.yaml
```

All you need to do is set detector_id and segmenter_id; the defaults in the file are detector_id = "IDEA-Research/grounding-dino-base" and segmenter_id = "facebook/sam-vit-huge". You can also adjust the threshold hyperparameter to refine the masks.
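Conceptually, the threshold discards low-confidence detections before mask extraction: raising it yields fewer, cleaner boxes; lowering it keeps more candidates. A toy illustration (the tuple format is a simplified stand-in, not the detector's real output schema):

```python
def filter_detections(detections, threshold=0.3):
    """Keep only detections whose confidence score meets the threshold.

    `detections` is a list of (label, score, box) tuples -- a simplified
    stand-in for Grounding DINO's output used for illustration."""
    return [d for d in detections if d[1] >= threshold]
```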

Grounded SAM (Official Version) (<24GB RAM)

```shell
CUDA_VISIBLE_DEVICES=0 python data_preparation.py configs/object/erase_ship.yaml
```

Download Pre-cached Files

Before beginning the mass concept erasing process, ensure that you have pre-cached the prior knowledge (e.g., MSCOCO) and domain-specific knowledge (e.g., certain celebrities, artistic styles, or objects) you wish to retain.

  • You can download our pre-cached files from this OneDrive folder. Once downloaded, place these files in the ./cache/ directory.

  • Alternatively, to preserve additional knowledge of your choice, you can cache the information by modifying the script src/cache_coco.py.

Training MACE to Erase Concepts

After preparing the data, you can specify your training parameters in the same configuration file configs/object/erase_ship.yaml and run the following command:

```shell
CUDA_VISIBLE_DEVICES=0 python training.py configs/object/erase_ship.yaml
```

Sampling from the Finetuned Model

The finetuned model can be tested by running the following command to generate several images:

```shell
CUDA_VISIBLE_DEVICES=0 python inference.py \
          --num_images 3 \
          --prompt 'your_prompt' \
          --model_path /path/to/saved_model/LoRA_fusion_model \
          --save_path /path/to/save/folder
```
