Di[𝙼]O: Distilling Masked Diffusion Models into One-step Generator

[ICCV 2025] Yuanzhi Zhu, Xi Wang, Stéphane Lathuilière, Vicky Kalogeiton

Project page: https://yuanzhi-zhu.github.io/DiMO/ | arXiv: https://arxiv.org/abs/2503.15457 | Hugging Face models: https://huggingface.co/Yuanzhi/DiMO | alphaXiv blog: https://www.alphaxiv.org/overview/2503.15457

TLDR

We develop an algorithm that distills masked diffusion models (MDMs) into a one-step generator by matching the output distributions of the teacher and the student model.

(Method illustration: https://yuanzhi-zhu.github.io/DiMO/static/images/illustration.png)
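
For intuition, here is a minimal sketch of token-level distribution matching. The loss form, tensor shapes, and names below are illustrative assumptions, not the paper's exact objective:

import torch
import torch.nn.functional as F

def distribution_matching_loss(student_logits, teacher_logits, mask):
    # Token-level KL(teacher || student), averaged over masked positions.
    # Assumed shapes: logits (batch, seq_len, vocab), mask (batch, seq_len) bool.
    log_p_student = F.log_softmax(student_logits, dim=-1)
    log_p_teacher = F.log_softmax(teacher_logits, dim=-1)
    kl = (log_p_teacher.exp() * (log_p_teacher - log_p_student)).sum(-1)
    return (kl * mask).sum() / mask.sum().clamp_min(1)

# Toy example: 2 sequences of 16 tokens over a 1024-token visual vocabulary.
student = torch.randn(2, 16, 1024, requires_grad=True)
teacher = torch.randn(2, 16, 1024)
mask = torch.rand(2, 16) > 0.5
distribution_matching_loss(student, teacher, mask).backward()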

Setup

Clone and Install

git clone https://github.com/yuanzhi-zhu/DiMO.git
cd DiMO
pip install -r requirements.txt
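
The training and sampling commands below assume a CUDA-enabled PyTorch install; a quick, generic sanity check:

import torch

# Verify that PyTorch sees a GPU before launching torchrun jobs.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())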

Training Code

Download the Pre-trained Models

MaskGit (PyTorch): follow https://github.com/valeoai/Halton-MaskGIT/tree/v1.0

Meissonic: Hugging Face model name "meissonflow/meissonic"
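
If you prefer to fetch the Meissonic teacher weights explicitly rather than letting the code resolve the Hub name at runtime, here is a sketch using huggingface_hub (local_dir below is a hypothetical location; point the training script at wherever you put the files):

from huggingface_hub import snapshot_download

# Download the Meissonic weights from the Hugging Face Hub.
# local_dir is an assumed path, not one the repo requires.
snapshot_download(repo_id="meissonflow/meissonic",
                  local_dir="./models/meissonic_teacher")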

Train MaskGit

bash scripts/launch_DIMO_MaskGit.sh

Train Meissonic

bash scripts/launch_DIMO_Meissonic.sh

Inference Code

Download the Distilled Models

# pwd
# /path/to/DiMO
huggingface-cli download Yuanzhi/DiMO --local-dir models
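
For orientation before running the sampling scripts: a one-step generator fills every masked position in a single forward pass, rather than iterating over many unmasking steps. A toy sketch of that idea (generator, vq_decoder, and all other names here are hypothetical, not the repo's API):

import torch

@torch.no_grad()
def one_step_sample(generator, vq_decoder, seq_len, mask_token_id, batch=4):
    # Start from a fully masked token sequence.
    tokens = torch.full((batch, seq_len), mask_token_id, dtype=torch.long)
    # One forward pass predicts a distribution for every position at once.
    logits = generator(tokens)                  # (batch, seq_len, vocab)
    probs = torch.softmax(logits, dim=-1)
    sampled = torch.multinomial(probs.flatten(0, 1), 1).view(batch, seq_len)
    # Decode the discrete tokens back to pixels with the VQ decoder.
    return vq_decoder(sampled)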

Sample Images MaskGit

network_dir="./models/maskgit"
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 sample_MaskGit.py \
    --save_dir ./samples/test_sample \
    --vit_path $network_dir \
    --mode sample \
    --nb_sample 64
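
To eyeball the outputs of either sampling script, you can tile them into a contact sheet. The sketch below assumes samples are written as individual .png files under save_dir; check the script's actual output layout:

from pathlib import Path
from PIL import Image

def make_contact_sheet(sample_dir, out_path, cols=8, thumb=128):
    # Collect saved samples (assumed to be .png; adjust the glob if needed).
    paths = sorted(Path(sample_dir).glob("*.png"))
    if not paths:
        raise FileNotFoundError(f"no .png files in {sample_dir}")
    rows = (len(paths) + cols - 1) // cols
    sheet = Image.new("RGB", (cols * thumb, rows * thumb))
    for i, p in enumerate(paths):
        img = Image.open(p).convert("RGB").resize((thumb, thumb))
        sheet.paste(img, ((i % cols) * thumb, (i // cols) * thumb))
    sheet.save(out_path)

make_contact_sheet("./samples/test_sample", "grid.png")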

Sample Images Meissonic

network_dir='./models/meissonic'
CUDA_VISIBLE_DEVICES=0 torchrun \
    --nproc_per_node=1 \
    --master_port=29501 \
    sample_Meissonic.py \
    --save_dir ./samples/Meissonic_sample \
    --vit_path $network_dir \
    --mode sample

Citation

If you find this repo helpful, please cite:

@article{zhu2025di,
  title={Di$\mathtt{[M]}$O: Distilling Masked Diffusion Models into One-step Generator},
  author={Zhu, Yuanzhi and Wang, Xi and Lathuili{\`e}re, St{\'e}phane and Kalogeiton, Vicky},
  journal={arXiv preprint arXiv:2503.15457},
  year={2025}
}

Acknowledgments

This work was supported by ANR-22-CE23-0007, ANR-22-CE39-0016, the Hi!Paris grant and fellowship, and the DATAIA Convergence Institute as part of the “Programme d'Investissement d'Avenir” (ANR-17-CONV-0003) operated by Ecole Polytechnique, IP Paris, and was granted access to the IDRIS High-Performance Computing (HPC) resources under allocations 2024-AD011014300R1 and 2025-AD011015894 made by GENCI and the mesoGIP of IP Paris. We sincerely thank Nacereddine Laddaoui for help with infrastructure, and Haoge Deng and Yao Teng for insightful discussions that contributed to this work. We are also grateful to Nicolas Dufour, Robin Courant, and Lucas Degeorge for their meticulous proofreading.

This codebase is based on the MaskGit PyTorch implementation (Halton-MaskGIT) and Meissonic.
