
Rein

[CVPR 2024] Official implementation of "Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation"


[CVPR 2024] Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation

Zhixiang Wei<sup>1</sup>, Lin Chen<sup>2</sup>, et al. <br /> <sup>1</sup> University of Science and Technology of China <sup>2</sup> Shanghai AI Laboratory

Project page: https://zxwei.site/rein

Paper: https://arxiv.org/pdf/2312.04265.pdf

Rein is an efficient and robust fine-tuning method, specifically developed to effectively utilize Vision Foundation Models (VFMs) for Domain Generalized Semantic Segmentation (DGSS). It achieves state-of-the-art results on Cityscapes to ACDC, and on GTAV to Cityscapes+Mapillary+BDD100K. Using only synthetic data, Rein achieved an mIoU of 78.4% on the Cityscapes validation set! Using only the data from the Cityscapes training set, we achieved an average mIoU of 77.6% on the ACDC test set!
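At a high level, Rein keeps the VFM backbone frozen and trains only a small set of per-layer learnable tokens that refine the backbone's features. The sketch below illustrates that idea in NumPy; the exact shapes, the scaled-similarity weighting, and the residual update form are illustrative assumptions, not the repository's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def rein_refine(features, tokens, proj):
    """Refine frozen-backbone features with a small set of learnable tokens.

    features: (n, c) patch features from one frozen backbone layer
    tokens:   (m, c) learnable tokens, with m << n
    proj:     (c, c) learnable projection applied to the token-weighted update
    Returns refined features of the same shape (residual update); only the
    tokens and the projection would receive gradients, the backbone stays frozen.
    """
    c = features.shape[1]
    # Similarity between each patch feature and each token, softmax-normalized
    attn = softmax(features @ tokens.T / np.sqrt(c))   # (n, m)
    delta = attn @ tokens @ proj                       # token-informed update, (n, c)
    return features + delta                            # residual refinement

rng = np.random.default_rng(0)
f = rng.standard_normal((196, 64))        # e.g. 14x14 patches, 64-dim features
T = rng.standard_normal((8, 64)) * 0.02   # 8 learnable tokens
W = rng.standard_normal((64, 64)) * 0.02  # learnable projection
out = rein_refine(f, T, W)
print(out.shape)  # (196, 64)
```

Because the update is residual and the token count is small, the number of trainable parameters stays far below full fine-tuning of the backbone.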

Visualization

Trained on Cityscapes, Rein generalizes to unseen driving scenes and cities: Nighttime Shanghai, Foggy Countryside, and Rainy Hollywood.


🔥 News!

  • 🔥 Welcome to check out our latest work: Rein++: Efficient Generalization and Adaptation for Semantic Segmentation with Vision Foundation Models!

  • 🔥 Delighted to announce that our work HQCLIP: Leveraging Vision-Language Models to Create High-Quality Image-Text Datasets and CLIP Models was accepted by ICCV 2025!

  • 🔥 We warmly congratulate SoMA (https://ysj9909.github.io/SoRA.github.io/) for receiving a CVPR 2025 Highlight! Built upon the Rein codebase, SoMA demonstrates outstanding semantic segmentation performance and has even successfully accomplished object detection tasks!

  • 🔥 To facilitate users in integrating reins into their own projects, we provide a simplified version of reins: simple_reins. With this version, users can easily use reins as a feature extractor. (Note: This version has removed features related to mask2former)

  • We have uploaded the config for ResNet and ConvNeXt.

  • We have uploaded the checkpoint and config for training with an additional 1/16 of the Cityscapes training set, which achieves 82.5% on the Cityscapes validation set!

  • Rein was accepted at CVPR 2024!

  • Using only the data from the Cityscapes training set, we achieved an average mIoU of 77.56% on the ACDC test set! This result ranks first among DGSS methods on the ACDC benchmark! The checkpoint is available in the release.

  • Using only synthetic data (UrbanSyn, GTAV, and Synthia), Rein achieved an mIoU of 78.4% on Cityscapes! The checkpoint is available in the release.
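The simple_reins variant mentioned above exposes the method as a feature extractor: rein-style refinement is applied after each frozen backbone block, and the per-layer refined features are returned. The following is a schematic NumPy sketch of that pattern, not the simple_reins API; the stand-in backbone, shapes, and refinement form are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def frozen_block(x, weight):
    # Stand-in for one frozen backbone layer (its weights are never trained)
    return np.maximum(x @ weight, 0.0)

def extract_features(x, backbone_weights, rein_tokens):
    """Run a frozen backbone, refining each layer's output with its own tokens.

    Only the per-layer tokens would be trained; backbone_weights stay fixed,
    which is the parameter-efficient part of the approach.
    """
    feats = []
    for weight, tokens in zip(backbone_weights, rein_tokens):
        x = frozen_block(x, weight)
        attn = x @ tokens.T                         # (n, m) patch-token similarity
        attn = np.exp(attn - attn.max(1, keepdims=True))
        attn /= attn.sum(1, keepdims=True)          # softmax over tokens
        x = x + attn @ tokens                       # residual token-based refinement
        feats.append(x)
    return feats                                    # per-layer refined features

x = rng.standard_normal((196, 32))                      # 196 patches, 32-dim
Ws = [rng.standard_normal((32, 32)) * 0.1 for _ in range(3)]
Ts = [rng.standard_normal((8, 32)) * 0.02 for _ in range(3)]
feats = extract_features(x, Ws, Ts)
print(len(feats), feats[-1].shape)  # 3 (196, 32)
```

Returning features from every layer mirrors how dense-prediction heads typically consume multi-level backbone outputs.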

Performance Under Various Settings (DINOv2).

|Setting|mIoU|Config|Log & Checkpoint|
|-|-|-|-|
|GTAV $\rightarrow$ Cityscapes|66.7|config|log & checkpoint|
|+Synthia $\rightarrow$ Cityscapes|72.2|config|log & checkpoint|
|+UrbanSyn $\rightarrow$ Cityscapes|78.4|config|log & checkpoint|
|+1/16 of Cityscapes training $\rightarrow$ Cityscapes|82.5|config|log & checkpoint|
|GTAV $\rightarrow$ BDD100K|60.0|config|log & checkpoint|
|Cityscapes $\rightarrow$ ACDC|77.6|config|log & checkpoint|
|Cityscapes $\rightarrow$ Cityscapes-C|60.0|config|log & checkpoint|

Performance For Various Backbones (Trained on GTAV).

|Backbone|Pretraining|Citys. mIoU|Config|Log & Checkpoint|
|-|-|-|-|-|
|ResNet50|ImageNet1k|49.1|config|log & checkpoint|
|ResNet101|ImageNet1k|45.9|config|log & checkpoint|
|ConvNeXt-Large|ImageNet21k|57.9|config|log & checkpoint|
|ViT-Small|DINOv2|55.3|config|log & checkpoint|
|ViT-Base|DINOv2|64.3|config|log & checkpoint|
|CLIP-Large|OpenAI|58.1|config|log & checkpoint|
|SAM-Huge|SAM|59.2|config|log & checkpoint|
|EVA02-Large|EVA02|67.8|config|log & checkpoint|

Citation

If you find our code or data helpful, please cite our paper:

```bibtex
@InProceedings{Wei_2024_CVPR,
    author    = {Wei, Zhixiang and Chen, Lin and Jin, Yi and Ma, Xiaoxiao and Liu, Tianle and Ling, Pengyang and Wang, Ben and Chen, Huaian and Zheng, Jinjin},
    title     = {Stronger Fewer \& Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {28619-28630}
}
```

Try and Test

Experience the demo: see the project page at https://zxwei.site/rein.
