Unidisc

UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, and inpainting.

<div align="center"> <br> <img src="docs/images/banner.webp" width="1000"> <h3>Unified Multimodal Discrete Diffusion</h3>

Alexander Swerdlow<sup>1*</sup>  Mihir Prabhudesai<sup>1*</sup>  Siddharth Gandhi<sup>1</sup>  Deepak Pathak<sup>1</sup>  Katerina Fragkiadaki<sup>1</sup>  <br>

<sup>1</sup> Carnegie Mellon University 

ArXiv Webpage

<!-- [![Demo](https://img.shields.io/badge/Demo-Custom-<COLOR>.svg)](https://huggingface.co/spaces/todo) --> </div>

Hugging Face models

The UniDisc checkpoints are available on Hugging Face, for example `aswerdlow/unidisc_interleaved`, which the Inference section below downloads.

Getting Started

To install the dependencies, run:

git submodule update --init --recursive
uv sync --no-group dev
uv sync

For a more detailed installation guide, please refer to INSTALL.md.

Data

See DATA.md for details on how to download and preprocess the datasets. We provide processing scripts and instructions for every dataset used. We also release a synthetic dataset, together with its generation scripts and the raw data.

Training

See TRAIN.md for training commands.

Inference

Interactive demo:

mkdir -p ./ckpts/unidisc_interleaved
huggingface-cli download aswerdlow/unidisc_interleaved --local-dir ./ckpts/unidisc_interleaved
uv run demo/server.py experiments='[large_scale_train,large_scale_train_high_res_interleaved,eval_unified,large_scale_high_res_interleaved_inference]' trainer.load_from_state_dict="./ckpts/unidisc_interleaved/unidisc_interleaved.pt"
uv run demo/client.py
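
The server command above is long and easy to mistype, so the following is a minimal sketch of how it could be assembled and sanity-checked from Python before launching. The helper names `build_server_command` and `checkpoint_ready` are hypothetical and not part of this repository; the experiment names and checkpoint path are copied from the commands above. The sketch only prints the command and does not start the server.

```python
import shlex
from pathlib import Path

# Values copied from the demo commands above.
CKPT_FILE = "./ckpts/unidisc_interleaved/unidisc_interleaved.pt"
EXPERIMENTS = [
    "large_scale_train",
    "large_scale_train_high_res_interleaved",
    "eval_unified",
    "large_scale_high_res_interleaved_inference",
]


def build_server_command(ckpt_file=CKPT_FILE, experiments=EXPERIMENTS):
    """Hypothetical helper: build the argv list for `uv run demo/server.py`."""
    # The shell quotes in the original command are stripped before the
    # program sees them, so the raw argument is experiments=[a,b,c].
    experiments_arg = "experiments=[" + ",".join(experiments) + "]"
    ckpt_arg = f"trainer.load_from_state_dict={ckpt_file}"
    return ["uv", "run", "demo/server.py", experiments_arg, ckpt_arg]


def checkpoint_ready(ckpt_file=CKPT_FILE):
    """Hypothetical helper: True if the downloaded checkpoint file exists."""
    return Path(ckpt_file).is_file()


# Print a copy-pasteable, shell-quoted version of the command.
print(shlex.join(build_server_command()))
if not checkpoint_ready():
    print("Checkpoint not found; run the huggingface-cli download step first.")
```

To actually launch the server from the same script, the list returned by `build_server_command()` could be passed to `subprocess.run`, with `demo/client.py` started in a second terminal as shown above.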

Evaluation

See EVAL.md for details.

Citation

To cite our work, please use the following:

@article{swerdlow2025unidisc,
  title = {Unified Multimodal Discrete Diffusion},
  author = {Swerdlow, Alexander and Prabhudesai, Mihir and Gandhi, Siddharth and Pathak, Deepak and Fragkiadaki, Katerina},
  journal = {arXiv preprint arXiv:2503.20853},
  year = {2025},
  doi = {10.48550/arXiv.2503.20853},
}

Credits

This repository is built on top of the following repositories:
