ReDi
[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis

🔥 News
- [2025/9/19] ReDi is accepted at NeurIPS 2025 as a Spotlight! 🎉
- [2025/9/3] Training code for REPA loss on top of ReDi released! 🎉
- [2025/6/7] Training code for SiT w/ ReDi released! 🎉
Setup
Download and set up the repo:
git clone https://github.com/zelaki/ReDi.git
cd ReDi
We provide an environment.yml file that can be used to create a Conda environment.
conda env create -f environment.yml
conda activate ReDi
Weights for our SiT-XL/2 w/ ReDi model, trained for 600 epochs on ImageNet 256x256, can be downloaded from Hugging Face 🤗:
| Model | Epochs | FID | SFID | IS | Precision | Recall |
|---------------------|---------|---------|----------|--------|----------|---------|
| SiT-XL/2 w/ ReDi | 600 | 1.64 | 4.63 | 289.3 | 0.65 | 0.77 |
Sampling

You can sample from our pre-trained ReDi models with sample.py.
python sample.py SDE --image-size 256 --seed 42 --ckpt /path/to/ckpt
Sample and Evaluate
First, download the ImageNet reference batch from ADM.
You can use the sample_ddp.py script to sample a large number of images in parallel. The script generates a folder of samples as well as a .npz file that is used directly with ADM's TensorFlow evaluation suite to compute FID, Inception Score, and other metrics. For example, to sample 50K images from our pre-trained ReDi model over N GPUs, run:
torchrun --nnodes=1 --nproc_per_node=N sample_ddp.py SDE --model SiT-XL/2 --num-fid-samples 50000 --pca-rank 8 --ckpt pretrained_models/SiT-ReDi-XL-2.pt --cfg-scale 2.4 --cfg-vae True --ref-batch VIRTUAL_imagenet256_labeled.npz
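If you launch sampling from a Python driver rather than typing the command by hand, the `N` placeholder can be filled in programmatically. The sketch below is only an illustration: `build_sampling_command` is a hypothetical helper, not part of the repository, and it simply reproduces the flags from the command above.

```python
import shlex


def build_sampling_command(n_gpus, ckpt, ref_batch, num_samples=50000):
    """Return the torchrun argument list for sample_ddp.py.

    Hypothetical helper: the flags are copied from the README command,
    only the GPU count, checkpoint, and reference batch are parameters.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return [
        "torchrun", "--nnodes=1", f"--nproc_per_node={n_gpus}",
        "sample_ddp.py", "SDE",
        "--model", "SiT-XL/2",
        "--num-fid-samples", str(num_samples),
        "--pca-rank", "8",
        "--ckpt", ckpt,
        "--cfg-scale", "2.4",
        "--cfg-vae", "True",
        "--ref-batch", ref_batch,
    ]


cmd = build_sampling_command(
    n_gpus=4,
    ckpt="pretrained_models/SiT-ReDi-XL-2.pt",
    ref_batch="VIRTUAL_imagenet256_labeled.npz",
)
# Print the shell-quoted command for inspection.
print(shlex.join(cmd))
# To actually launch it: subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell-quoting problems when the checkpoint path contains spaces.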
Representation Guidance
--rg-scale: Set >1.0 to use Representation Guidance during sampling. Note: this works only for models trained with --dino-drop-prob > 0.
Data Preprocessing
First, download ImageNet and follow the preprocessing guide from the REPA repository.
DINOv2 PCA model
We provide a pre-computed full-rank PCA model. You can adjust the number of PCs during training. If you want to re-compute the PCA model, you can use the following script:
torchrun --nnodes=1 --nproc_per_node=1 calc_pca.py --feature-path "/path/to/your/local/features_dir"
By default, we use 300 batches with batch-size 256 for PCA.
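With the defaults, the PCA is fitted on 300 × 256 = 76,800 samples. The idea of fitting a full-rank PCA once and choosing the number of components later can be illustrated with a minimal NumPy sketch. This is not the implementation in calc_pca.py: the function names, the random stand-in features, and the array sizes are all assumptions made for illustration.

```python
import numpy as np


def fit_pca(features):
    """Fit a full-rank PCA on an (n_samples, dim) array.

    Returns the feature mean and the principal directions as rows,
    sorted by decreasing explained variance.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt


def project(features, mean, components, rank):
    """Project features onto the first `rank` principal components."""
    return (features - mean) @ components[:rank].T


rng = np.random.default_rng(0)
# Stand-in for DINOv2 features: 1024 vectors of dimension 64 (made-up sizes).
feats = rng.normal(size=(1024, 64)).astype(np.float32)

mean, components = fit_pca(feats)
# The rank is chosen at projection time, mirroring --pca-rank 8.
reduced = project(feats, mean, components, rank=8)
print(reduced.shape)  # (1024, 8)
```

Because the full set of components is stored, switching to a different rank only changes the slice taken at projection time and does not require refitting.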
Training
To train ReDi w/ SiT use the following script:
torchrun --nnodes 1 --nproc_per_node 8 train_redi.py \
--model "SiT-XL/2" \
--feature-path "/path/to/your/local/features_dir" \
--pca-rank 8 \
--pca-model-path ./pcs/dino_pca_model.pth
The script automatically creates a folder under results to save logs and checkpoints. You can adjust the following options:
- --model: One of [SiT-B/2, SiT-L/2, SiT-XL/2]
- --pca-rank: Number of DINOv2 principal components (PCs) to use for joint training
- --pca-model-path: Path to the precomputed PCA model
- --dino-drop-prob: Set to 0.2 if you plan to use Representation Guidance during inference
Add REPA loss
To enable REPA loss on top of ReDi use the following arguments:
- --repa-loss True
- --repa-layer: Layer to apply REPA projection.
- --repa-weight: Weight of REPA loss.
Acknowledgement
This code is mainly built upon the REPA, SiT, and fastDiT repositories.
Citation
@article{kouzelis2025boosting,
title={Boosting Generative Image Modeling via Joint Image-Feature Synthesis},
author={Kouzelis, Theodoros and Karypidis, Efstathios and Kakogeorgiou, Ioannis and Gidaris, Spyros and Komodakis, Nikos},
journal={arXiv preprint arXiv:2504.16064},
year={2025}
}
