SlotFormer

Code release for the ICLR 2023 paper on SlotFormer, an object-centric dynamics model

SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
Ziyi Wu, Nikita Dvornik, Klaus Greff, Thomas Kipf, Animesh Garg
ICLR'23 | GitHub | arXiv | Project page

[Figure: two example videos, each showing Ground-Truth alongside Our Prediction]

Introduction

This is the official PyTorch implementation of the paper SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models, accepted at ICLR 2023. The code contains:

- Training base object-centric slot models
- Video prediction task on OBJ3D and CLEVRER
- VQA task on CLEVRER
- VQA task on Physion
- Planning task on PHYRE

Update

- 2023.9.20: BC-breaking change! We fixed an error in the mIoU calculation code. This does not change the ranking of the benchmarked methods, but it does change their absolute values. See this PR for more details. Please re-run the evaluation code on your trained models to get the correct results. The updated mIoU of SlotFormer on CLEVRER is 49.42 (using the provided pre-trained weights).
- 2023.1.20: The paper is accepted by ICLR 2023!
- 2022.10.26: Support Physion VQA task and PHYRE planning task
- 2022.10.16: Initial code release!
  - Support base object-centric model training
  - Support SlotFormer training
  - Support evaluation on the video prediction task
  - Support evaluation on the CLEVRER VQA task

Installation

Please refer to install.md for step-by-step guidance on how to install the packages.

Experiments

This codebase is tailored to Slurm GPU clusters with a preemption mechanism. For the configs, we mainly use RTX 6000 GPUs with 24 GB of memory (though many experiments don't require that much memory). Please modify the code accordingly if you are using other hardware settings:

- Please go through scripts/train.py and change the fields marked by TODO:
- Please read the config file for the model you want to train. We use DDP with multiple GPUs to accelerate training. You can use fewer GPUs to achieve a better memory-speed trade-off.
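
As a rough illustration of the GPU-count trade-off mentioned above: under DDP each process (one per GPU) holds its own slice of the batch, so with a fixed global batch size, fewer GPUs means a larger per-GPU batch and therefore more memory per device. The sketch below is not taken from this repository. The function name `per_gpu_batch_size` and the example numbers are hypothetical; the actual field names and values are in each model's config file.

```python
def per_gpu_batch_size(global_batch_size: int, num_gpus: int) -> int:
    """Split a fixed global batch evenly across DDP processes (one per GPU).

    Hypothetical helper for illustration only; not part of the SlotFormer code.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if global_batch_size % num_gpus != 0:
        raise ValueError("global_batch_size must be divisible by num_gpus")
    return global_batch_size // num_gpus


if __name__ == "__main__":
    # Assumed global batch of 64 samples per optimization step (example value).
    for gpus in (4, 2, 1):
        print(f"{gpus} GPU(s): {per_gpu_batch_size(64, gpus)} samples per GPU")
```

If the larger per-GPU batch no longer fits in memory, the usual options are to reduce the global batch size or to keep more GPUs; either choice may affect training speed and results, so it is worth checking against the settings documented in the config.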

Dataset Preparation

Please refer to data.md for steps to download and pre-process each dataset.

Reproduce Results

Please see benchmark.md for detailed instructions on how to reproduce our results in the paper.

Citation

Please cite our paper if you find it useful in your research:

@article{wu2022slotformer,
  title={SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models},
  author={Wu, Ziyi and Dvornik, Nikita and Greff, Klaus and Kipf, Thomas and Garg, Animesh},
  journal={arXiv preprint arXiv:2210.05861},
  year={2022}
}

Acknowledgement

We thank the authors of Slot-Attention, slot_attention.pytorch, SAVi, RPIN and Aloe for open-sourcing their wonderful work.

License

SlotFormer is released under the MIT License. See the LICENSE file for more details.

Contact

If you have any questions about the code, please contact Ziyi Wu (dazitu616@gmail.com).
