HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation
by Lukas Hoyer, Dengxin Dai, and Luc Van Gool
[ECCV22 Paper] [Extension Paper]
:bell: News:
- [2024-07-03] We are happy to announce that our work SemiVL on semi-supervised semantic segmentation with vision-language guidance was accepted at ECCV24.
- [2024-07-03] We are happy to announce that our follow-up work DGInStyle on image diffusion for domain-generalizable semantic segmentation was accepted at ECCV24.
- [2023-09-26] We are happy to announce that our Extension Paper on domain generalization and clear-to-adverse-weather UDA was accepted at PAMI.
- [2023-08-25] We are happy to announce that our follow-up work EDAPS on panoptic segmentation UDA was accepted at ICCV23.
- [2023-04-27] We further extend HRDA to domain generalization and clear-to-adverse-weather UDA in the Extension Paper.
- [2023-02-28] We are happy to announce that our follow-up work MIC on context-enhanced UDA was accepted at CVPR23.
- [2022-07-05] We are happy to announce that HRDA was accepted at ECCV22.
Overview
Unsupervised domain adaptation (UDA) aims to adapt a model trained on synthetic data to real-world data without requiring expensive annotations of real-world images. As UDA methods for semantic segmentation are usually GPU memory intensive, most previous methods operate only on downscaled images. We question this design as low-resolution predictions often fail to preserve fine details. The alternative of training with random crops of high-resolution images alleviates this problem but falls short in capturing long-range, domain-robust context information.
Therefore, we propose HRDA, a multi-resolution training approach for UDA that combines the strengths of small high-resolution crops, which preserve fine segmentation details, with large low-resolution crops, which capture long-range context dependencies, through a learned scale attention, while maintaining a manageable GPU memory footprint.
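The core fusion idea can be sketched as follows. This is a minimal NumPy illustration, not the repository's implementation: the function name, the simple per-pixel convex blend, and the tensor layout are assumptions for exposition; in HRDA the attention map is predicted by a learned scale-attention head inside the segmentation decoder.

```python
import numpy as np

def scale_attention_fusion(ctx_logits, detail_logits, attn, crop_box):
    """Blend a full-image low-resolution context prediction with a
    high-resolution detail crop using a scale attention map.

    ctx_logits    -- (C, H, W) context-branch logits, upsampled to full size
    detail_logits -- (C, h, w) detail-branch logits for the HR crop
    attn          -- (h, w) attention weights in [0, 1] (detail-branch weight)
    crop_box      -- (y, x) top-left corner of the crop in the full image
    """
    y, x = crop_box
    _, h, w = detail_logits.shape
    fused = ctx_logits.copy()
    # Inside the crop, the attention decides per pixel how much the HR detail
    # prediction overrides the LR context prediction; outside the crop, only
    # the context branch contributes.
    fused[:, y:y + h, x:x + w] = (
        (1.0 - attn) * fused[:, y:y + h, x:x + w] + attn * detail_logits
    )
    return fused
```

With attention close to 1, fine detail from the HR crop dominates inside the crop region; with attention close to 0, the long-range context prediction is kept.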

HRDA enables adapting small objects and preserving fine segmentation details. It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA→Cityscapes and by 4.9 mIoU for Synthia→Cityscapes, resulting in an unprecedented performance of 73.8 and 65.8 mIoU, respectively.

The more detailed domain-adaptive semantic segmentation of HRDA, compared to the previous state-of-the-art UDA method DAFormer, can also be observed in example predictions from the Cityscapes validation set.

https://user-images.githubusercontent.com/1277888/181128057-27b8039f-a4c9-4f6d-9aa8-9b7f364d8921.mp4

HRDA can be further extended to domain generalization, which lifts the requirement of access to target images during training. Also in domain generalization, HRDA significantly improves the state-of-the-art performance, by +4.2 mIoU.
For more information on HRDA, please check our [ECCV Paper] and the [Extension Paper].
If you find HRDA useful in your research, please consider citing:
```bibtex
@InProceedings{hoyer2022hrda,
  title={{HRDA}: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={372--391},
  year={2022}
}

@Article{hoyer2024domain,
  title={Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)},
  year={2024},
  volume={46},
  number={1},
  pages={220--235},
  doi={10.1109/TPAMI.2023.3320613}
}
```
Comparison with SOTA UDA
HRDA significantly outperforms previous works on several UDA benchmarks. This includes synthetic-to-real adaptation on GTA→Cityscapes and Synthia→Cityscapes as well as clear-to-adverse-weather adaptation on Cityscapes→ACDC and Cityscapes→DarkZurich.
|                     | GTA→CS(val) | Synthia→CS(val) | CS→ACDC(test) | CS→DarkZurich(test) |
|---------------------|-------------|-----------------|---------------|---------------------|
| ADVENT [1]          | 45.5        | 41.2            | 32.7          | 29.7                |
| BDL [2]             | 48.5        | --              | 37.7          | 30.8                |
| FDA [3]             | 50.5        | --              | 45.7          | --                  |
| DACS [4]            | 52.1        | 48.3            | --            | --                  |
| ProDA [5]           | 57.5        | 55.5            | --            | --                  |
| MGCDA [6]           | --          | --              | 48.7          | 42.5                |
| DANNet [7]          | --          | --              | 50.0          | 45.2                |
| DAFormer (Ours) [8] | 68.3        | 60.9            | 55.4*         | 53.8*               |
| HRDA (Ours)         | 73.8        | 65.8            | 68.0*         | 55.9*               |
* New results of our extension paper
References:
1. Vu et al. "ADVENT: Adversarial entropy minimization for domain adaptation in semantic segmentation" in CVPR, 2019.
2. Li et al. "Bidirectional learning for domain adaptation of semantic segmentation" in CVPR, 2019.
3. Yang et al. "FDA: Fourier domain adaptation for semantic segmentation" in CVPR, 2020.
4. Tranheden et al. "DACS: Domain adaptation via cross-domain mixed sampling" in WACV, 2021.
5. Zhang et al. "Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation" in CVPR, 2021.
6. Sakaridis et al. "Map-guided curriculum domain adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation" in TPAMI, 2020.
7. Wu et al. "DANNet: A one-stage domain adaptation network for unsupervised nighttime semantic segmentation" in CVPR, 2021.
8. Hoyer et al. "DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation" in CVPR, 2022.
Comparison with SOTA Domain Generalization (DG)
HRDA and DAFormer significantly outperform previous works on domain generalization from GTA to real street scenes.
| DG Method           | Cityscapes | BDD100K | Mapillary | Avg.   |
|---------------------|------------|---------|-----------|--------|
| IBN-Net [1,5]       | 37.37      | 34.21   | 36.81     | 36.13  |
| DRPC [2]            | 42.53      | 38.72   | 38.05     | 39.77  |
| ISW [3,5]           | 37.20      | 33.36   | 35.57     | 35.38  |
| SAN-SAW [4]         | 45.33      | 41.18   | 40.77     | 42.43  |
| SHADE [5]           | 46.66      | 43.66   | 45.50     | 45.27  |
| DAFormer (Ours) [6] | 52.65*     | 47.89*  | 54.66*    | 51.73* |
| HRDA (Ours)         | 57.41*     | 49.11*  | 61.16*    | 55.90* |
* New results of our extension paper
References:
1. Pan et al. "Two at once: Enhancing learning and generalization capacities via IBN-Net" in ECCV, 2018.
2. Yue et al. "Domain randomization and pyramid consistency: Simulation-to-real generalization without accessing target domain data" in ICCV, 2019.
3. Choi et al. "RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening" in CVPR, 2021.
4. Peng et al. "Semantic-aware domain generalized segmentation" in CVPR, 2022.
5. Zhao et al. "Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation" in ECCV, 2022.
6. Hoyer et al. "DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation" in CVPR, 2022.
Setup Environment
For this project, we used Python 3.8.5. We recommend setting up a new virtual environment:

```shell
python -m venv ~/venv/hrda
source ~/venv/hrda/bin/activate
```

In that environment, the requirements can be installed with:

```shell
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7  # requires the other packages to be installed first
```
Please download the MiT-B5 ImageNet weights provided by SegFormer from their OneDrive and put them in the folder `pretrained/`.

Further, download the checkpoint of HRDA on GTA→Cityscapes and extract it to the folder `work_dirs/`.
Setup Datasets
**Cityscapes:** Please download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to `data/cityscapes`.

**GTA:** Please download all image and label packages from here and extract them to `data/gta`.
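After extraction, the data directory might look roughly like this. This layout is an assumption based on the standard Cityscapes and GTA download packages; the exact subfolder names expected by the dataset configs may differ, so check the configs in the repository.

```
data/
├── cityscapes/
│   ├── leftImg8bit/
│   └── gtFine/
└── gta/
    ├── images/
    └── labels/
```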
**Synthia (O
