EcoDiff
[ICLR 2026] Learnable Sparsity for Vision Generative Models
<p align="center"> <a href="https://iclr.cc/Conferences/2026"><img src="https://img.shields.io/badge/ICLR-2026-blue.svg" alt="ICLR 2026"></a> <a href="https://arxiv.org/abs/2412.02852"><img src="https://img.shields.io/badge/arXiv-2412.02852-b31b1b.svg" alt="arXiv"></a> <a href="https://openreview.net/forum?id=9pNWZLVZ4r"><img src="https://img.shields.io/badge/OpenReview-ICLR-orange.svg" alt="OpenReview"></a> <a href="https://yangzhang-v5.github.io/EcoDiff"><img src="https://img.shields.io/badge/Project-Page-blueviolet.svg" alt="Project Page"></a> <a href="#-model-weights"><img src="https://img.shields.io/badge/🤗-Model%20Weights-yellow.svg" alt="Model Weights"></a> <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-green.svg" alt="License"></a> </p>

Authors: Yang Zhang, Er Jin, Wenzhong Liang, Yanfei Dong, Ashkan Khakzar, Philip Torr, Johannes Stegmaier, Kenji Kawaguchi
Official implementation of the ICLR 2026 paper "Learnable Sparsity for Vision Generative Models", a novel approach to memory-efficient diffusion model pruning.
TL;DR: A model-agnostic structural pruning framework that achieves up to 20% parameter reduction with minimal performance loss through differentiable mask learning and time-step gradient checkpointing.

Table of Contents
<details> <summary>Table of Contents</summary> <ol> <li><a href="#overview">Overview</a></li> <li><a href="#%EF%B8%8F-installation">Installation</a></li> <li><a href="#-quick-start">Quick Start</a></li> <li><a href="#advanced-usage">Advanced Usage</a> <ul> <li><a href="#pruning-training">Pruning Training</a></li> <li><a href="#hyperparameter-tuning">Hyperparameter Tuning</a></li> <li><a href="#evaluation">Evaluation</a></li> <li><a href="#fine-tuning-after-pruning">Fine-tuning After Pruning</a></li> </ul> </li> <li><a href="#configuration-files">Configuration Files</a></li> <li><a href="#%EF%B8%8F-development">Development</a></li> <li><a href="#repository-structure">Repository Structure</a></li> <li><a href="#models">Models</a></li> <li><a href="#-model-weights">Model Weights</a></li> <li><a href="#-citation">Citation</a></li> <li><a href="#license">License</a></li> <li><a href="#acknowledgments">Acknowledgments</a></li> </ol> </details>

Overview

EcoDiff introduces a model-agnostic structural pruning framework that learns differentiable masks to sparsify diffusion models. Key innovations include:
- ✨ Model-agnostic pruning for various diffusion architectures
- 🧪 Differentiable mask learning allowing end-to-end optimization
- 🧵 Time-step gradient checkpointing for memory-efficient training
- 📉 Up to 20% parameter reduction with minimal performance loss
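The intuition behind the differentiable masks can be shown with a tiny, dependency-free sketch. This is purely illustrative and is not the paper's actual objective or parameterization: each prunable unit gets a logit whose sigmoid scales the unit's output, a reconstruction term keeps useful units alive, and an L1-style sparsity penalty pushes unneeded masks toward zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def step(theta, weight, x, target, lam, lr):
    """One manual gradient step on a single masked unit (toy example)."""
    m = sigmoid(theta)
    y = m * weight * x                      # masked forward pass
    # loss = (y - target)^2 + lam * m   (L1 penalty on the soft mask)
    grad_m = 2.0 * (y - target) * weight * x + lam
    grad_theta = grad_m * m * (1.0 - m)     # chain rule through the sigmoid
    return theta - lr * grad_theta

# A unit the output depends on keeps a high mask; a unit whose
# contribution is unneeded (target 0) is driven toward pruning.
theta_keep, theta_prune = 0.0, 0.0
for _ in range(500):
    theta_keep = step(theta_keep, weight=1.0, x=1.0, target=1.0, lam=0.1, lr=0.5)
    theta_prune = step(theta_prune, weight=1.0, x=1.0, target=0.0, lam=0.1, lr=0.5)

print(sigmoid(theta_keep), sigmoid(theta_prune))  # keep-mask stays high, prune-mask collapses
```

In the real framework the same trade-off is optimized end-to-end over full denoising trajectories rather than per-unit scalars.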
⚙️ Installation
Requirements
- Python 3.10+
- Anaconda or Miniconda
- CUDA-compatible GPU
Setup
# Create conda environment
conda create -n sdib python=3.10 -y
conda activate sdib
# Clone repository
git clone https://github.com/your-repo/ecodiff.git
cd ecodiff
# Install dependencies
pip install -e .[core,loggers,test]
Environment Configuration
Create a .env file:
PYTHON=/path/to/miniconda3/envs/sdib/bin/python
RESULTS_DIR=/path/to/ecodiff/results
CONFIG_DIR=/path/to/ecodiff/configs
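If you need these values from Python without extra dependencies, a minimal stdlib loader might look like the sketch below. The repository may well rely on a dedicated package such as python-dotenv instead; `load_env` is a hypothetical helper, shown only to make the file format concrete.

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines and export them (hypothetical helper)."""
    values = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks and comments
            key, _, val = line.partition("=")
            values[key.strip()] = val.strip()
            os.environ.setdefault(key.strip(), val.strip())
    return values
```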
🚀 Quick Start
1. Basic Pruning
# SDXL pruning
make visual cfg=sdxl
# FLUX pruning
make visual cfg=flux
2. Hyperparameter Tuning
# Generate configurations
python scripts/utils/hyperparameter_tuning.py --config configs/sdxl.yaml --task gen
# Run tuning
python scripts/utils/hyperparameter_tuning.py --task run --max_job 2
3. Evaluation
# Semantic evaluation
python scripts/evaluation/semantic_eval.py -sp <checkpoint_path> --task all
# Mask analysis
python scripts/evaluation/binary_mask_eval.py --ckpt <checkpoint_path> -lt 0.001
Advanced Usage
Pruning Training
# Direct training script
python scripts/train.py
# Development/debugging mode
make visual cfg=sdxl
make visual cfg=flux
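Training across many denoising steps is where the time-step gradient checkpointing from the overview matters. A back-of-envelope accounting of the memory trade-off (the numbers and the cost model are purely illustrative): without checkpointing, activations for every step must be kept for backpropagation; with it, only each step's small input state is stored and activations are recomputed during the backward pass, trading one extra forward pass for memory.

```python
def peak_memory(T, act_per_step, state_size, checkpointed):
    """Toy peak-memory model for backprop through T denoising steps."""
    if checkpointed:
        # store T step inputs; only one step's activations are live at a time
        return T * state_size + act_per_step
    # store every step's activations until the backward pass reaches them
    return T * act_per_step

no_ckpt = peak_memory(T=50, act_per_step=100, state_size=1, checkpointed=False)
ckpt = peak_memory(T=50, act_per_step=100, state_size=1, checkpointed=True)
print(no_ckpt, ckpt)  # → 5000 150
```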
Hyperparameter Tuning
# Generate configuration files
python scripts/utils/hyperparameter_tuning.py \
--config configs/sdxl.yaml \
--output_dir configs/param_sdxl_tuning \
-lr 0.1 0.2 \
-mask "hard_discrete" \
-re ".*" \
-lreg 1 0 \
-lrec 1 2 \
-b 0.1 0.01 \
-d 2 \
-pn sdxl_pruning \
--task gen
# Run tuning jobs
python scripts/utils/hyperparameter_tuning.py \
--output_dir configs/param_sdxl_tuning \
--task run \
--max_job 2
Evaluation
# Generate semantic evaluation
python scripts/evaluation/semantic_eval.py -sp <checkpoint_path> --task gen
# Run all semantic evaluations
python scripts/evaluation/semantic_eval.py -sp <checkpoint_path> --task all
# Binary mask evaluation with threshold
python scripts/evaluation/binary_mask_eval.py --ckpt <checkpoint_path> -lt 0.001
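Our reading of the `-lt` flag is that it sets the threshold for binarizing the learned soft masks (an assumption; check the script's `--help` for its actual meaning). Conceptually, binarization and the resulting sparsity ratio look like:

```python
def binarize(masks, threshold=0.001):
    """Mask values below the threshold mark pruned units (illustrative)."""
    keep = [1 if m >= threshold else 0 for m in masks]
    sparsity = 1.0 - sum(keep) / len(keep)  # fraction of units removed
    return keep, sparsity

keep, sparsity = binarize([0.9, 0.0004, 0.7, 0.0, 1.0], threshold=0.001)
print(keep, sparsity)  # → [1, 0, 1, 0, 1] 0.4
```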
Fine-tuning After Pruning
# SDXL LoRA fine-tuning
bash scripts/retraining/train_text_to_image_lora_sdxl.sh 30 0
# FLUX LoRA fine-tuning
bash scripts/retraining/train_text_to_image_lora_flux.sh 30 0
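As a refresher on why LoRA keeps retraining cheap after pruning: the frozen weight W is adapted by a low-rank update B·A, so only r·(d_in + d_out) parameters train per layer. A dependency-free sketch of the forward pass (illustrative only, not the retraining scripts' actual code):

```python
def lora_forward(x, W, A, B, scale=1.0):
    """y = W x + scale * B (A x), with A: r x d_in and B: d_out x r."""
    base = [sum(w * xi for w, xi in zip(row, x)) for row in W]  # frozen path
    ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]    # r-dim bottleneck
    upd = [sum(b * ai for b, ai in zip(row, ax)) for row in B]  # low-rank update
    return [bv + scale * uv for bv, uv in zip(base, upd)]

out = lora_forward([2, 3], W=[[1, 0], [0, 1]], A=[[1, 1]], B=[[1], [1]])
print(out)  # → [7.0, 8.0]
```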
Load Pruned Models
python scripts/load_pruned_model.py
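Structurally, loading a pruned model amounts to applying the learned binary mask to the weights, either by zeroing pruned units or by slicing them away entirely. A toy sketch with hypothetical shapes (the actual script's behavior depends on the checkpoint format):

```python
def apply_mask(weight_rows, mask):
    """Keep only rows whose mask survived binarization (structural pruning)."""
    return [row for row, keep in zip(weight_rows, mask) if keep]

w = [[1, 2], [3, 4], [5, 6]]   # 3 output units, 2 inputs each
print(apply_mask(w, [1, 0, 1]))  # → [[1, 2], [5, 6]]
```

Slicing (rather than zeroing) is what yields the actual parameter and memory reduction.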
Configuration Files
The framework uses YAML configuration files located in the configs/ directory:
configs/
├── dit.yaml # Diffusion Transformers configuration
├── flux.yaml # FLUX.1 Schnell model configuration
├── flux_dev.yaml # FLUX.1 Dev model configuration
├── sd2.yaml # Stable Diffusion v2 configuration
├── sd3.yaml # Stable Diffusion 3 configuration
└── sdxl.yaml # Stable Diffusion XL configuration
🛠️ Development
For developers contributing to the project:
# Install development dependencies
pip install pre-commit && pre-commit install
# Run tests
make test
# Format code
make format
# Clean generated files
make clean
Repository Structure
- src/sdib/ - Core pruning framework
- scripts/ - Training and evaluation scripts
- configs/ - Model configuration files
Models
Supported
- SDXL: Stable Diffusion XL
- FLUX.1: FLUX diffusion models
Experimental
These models are currently experimental implementations. They may require additional hyperparameter tuning for optimal performance.
- DiT: Diffusion Transformers
- SD2: Stable Diffusion v2
- SD3: Stable Diffusion 3
🤗 Model Weights
Pre-trained pruned models and retrained weights are available on HuggingFace:
| Model | Type | Link |
|-------|------|------|
| SDXL | Pruned | EcoDiff-SDXL-Pruned |
| FLUX (Schnell & Dev) | Pruned | EcoDiff-FLUX-Pruned |
| SDXL | Retrained (Full & LoRA) | EcoDiff-SDXL-Retrain-Weights |
| FLUX | Retrained (LoRA) | EcoDiff-FLUX-Retrain-Weights |
📝 Citation
@inproceedings{zhang2026learnable,
  title={Learnable Sparsity for Vision Generative Models},
  author={Zhang, Yang and Jin, Er and Liang, Wenzhong and Dong, Yanfei and Khakzar, Ashkan and Torr, Philip and Stegmaier, Johannes and Kawaguchi, Kenji},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=9pNWZLVZ4r}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built on Diffusers library
- Supports models from Stability AI and Black Forest Labs