DEPICT
[NeurIPS 2024] Official PyTorch implementation of Rethinking Decoders for Transformer-based Semantic Segmentation: Compression is All You Need
This repository is the official PyTorch implementation of the paper Rethinking Decoders for Transformer-based Semantic Segmentation: Compression is All You Need by Qishuai Wen and Chun-Guang Li, NeurIPS 2024.
<p align="center"> <img src="DEPICT.png" width="800px"/> <br> <em>DEPICT overview</em> </p>

📣 News
[2026/4/3] Our new paper, MiTA Attention, includes a linear-time Transformer for semantic segmentation (see Tab. 5).
[2025/9/19] Our follow-up paper has been accepted to NeurIPS 2025 as a Spotlight🌟! See CBSA.
[2024/9/26] This paper has been accepted to NeurIPS 2024 as a Poster!
📊 Models
We release our models trained on the ADE20K dataset, including variants of DEPICT-SA and DEPICT-CA.
<div align="center"> <img src="assets/performance.png" width="270px"/> <img src="assets/noise_tiny.png" width="270px"/> <img src="assets/noise_small.png" width="270px"/> <br> <em>Performance comparison</em> </div>

📝 Reproduction & Training Guidelines
Install Segmenter via
git clone https://github.com/rstrudel/segmenter ./segmenter
or by other means, and prepare the datasets by following the instructions in the Segmenter repository.
For example, run
pip install -r requirements.txt
export DATASET=datasets
python -m segm.scripts.prepare_ade20k $DATASET
Additionally, run
pip install scipy
and create a new folder named "log".
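For example, the "log" folder (used below to hold the model folders and checkpoints) can be created from the segmenter root:

```shell
# create the folder that will hold the released model folders / checkpoints
mkdir -p log
```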
Once done, the file structure is as follows:
segmenter/
├── datasets/
│ ├── ade20k
│ │ ├── ADEChallengeData2016
│ │ └── release_test
│ └── ...
├── log/
├── segm/
│ ├── model/
│ ├── config.py
│ └── ...
├── requirements.txt
└── ...
Then replace the folder "model" and the file "config.py" with ours, and place our model folders, such as "DEPICT-SA-Small", into the "log" folder.
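As a sketch, assuming this repository is cloned as ./DEPICT next to ./segmenter and that our "model" folder and "config.py" sit under DEPICT/segm (the exact layout may differ; verify against both repos before copying):

```shell
# illustrative paths -- adjust to the actual layout of both repositories
rm -rf segmenter/segm/model
cp -r DEPICT/segm/model segmenter/segm/        # our decoder code
cp DEPICT/segm/config.py segmenter/segm/       # our config
cp -r DEPICT/DEPICT-SA-Small segmenter/log/    # a released model folder
```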
Finally, evaluate the models via
# single-scale evaluation:
python -m segm.eval.miou log/DEPICT-SA-Small/checkpoint.pth ade20k --singlescale
# multi-scale evaluation:
python -m segm.eval.miou log/DEPICT-SA-Small/checkpoint.pth ade20k --multiscale
or re-train it via
python -m segm.train --log-dir log/DEPICT-SA-Small --dataset ade20k --backbone vit_small_patch16_384 --decoder mask_transformer
P.S. To evaluate DEPICT-CA, change line 19 of model/decoder.py to "mode='ca'". We aim to make minimal modifications to the Segmenter code, keeping all differences confined to the config.yml file and the model folder released above.
Acknowledgements
Our work and code are inspired by and built upon CRATE (Yu et al., 2023) and Segmenter (Strudel et al., 2021). The source of the above image examples is d2l.ai.
