HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation
by Lukas Hoyer, Dengxin Dai, and Luc Van Gool
[ECCV22 Paper] [Extension Paper]
:bell: News:
- [2024-07-03] We are happy to announce that our work SemiVL on semi-supervised semantic segmentation with vision-language guidance was accepted at ECCV24.
- [2024-07-03] We are happy to announce that our follow-up work DGInStyle on image diffusion for domain-generalizable semantic segmentation was accepted at ECCV24.
- [2023-09-26] We are happy to announce that our Extension Paper on domain generalization and clear-to-adverse-weather UDA was accepted at PAMI.
- [2023-08-25] We are happy to announce that our follow-up work EDAPS on panoptic segmentation UDA was accepted at ICCV23.
- [2023-04-27] We further extend HRDA to domain generalization and clear-to-adverse-weather UDA in the Extension Paper.
- [2023-02-28] We are happy to announce that our follow-up work MIC on context-enhanced UDA was accepted at CVPR23.
- [2022-07-05] We are happy to announce that HRDA was accepted at ECCV22.
Overview
Unsupervised domain adaptation (UDA) aims to adapt a model trained on synthetic data to real-world data without requiring expensive annotations of real-world images. As UDA methods for semantic segmentation are usually GPU memory intensive, most previous methods operate only on downscaled images. We question this design as low-resolution predictions often fail to preserve fine details. The alternative of training with random crops of high-resolution images alleviates this problem but falls short in capturing long-range, domain-robust context information.
Therefore, we propose HRDA, a multi-resolution training approach for UDA that combines the strengths of small high-resolution crops, which preserve fine segmentation details, with large low-resolution crops, which capture long-range context dependencies, through a learned scale attention, while maintaining a manageable GPU memory footprint.
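The core fusion idea can be sketched as follows. This is a minimal NumPy illustration, not the repository's implementation: the function name, the simple per-pixel convex blend, and the tensor layout are assumptions for exposition; in HRDA the attention map is predicted by a learned scale-attention head inside the segmentation decoder.

```python
import numpy as np

def scale_attention_fusion(ctx_logits, detail_logits, attn, crop_box):
    """Blend a full-image low-resolution context prediction with a
    high-resolution detail crop using a scale attention map.

    ctx_logits    -- (C, H, W) context-branch logits, upsampled to full size
    detail_logits -- (C, h, w) detail-branch logits for the HR crop
    attn          -- (h, w) attention weights in [0, 1] (detail-branch weight)
    crop_box      -- (y, x) top-left corner of the crop in the full image
    """
    y, x = crop_box
    _, h, w = detail_logits.shape
    fused = ctx_logits.copy()
    # Inside the crop, the attention decides per pixel how much the HR detail
    # prediction overrides the LR context prediction; outside the crop, only
    # the context branch contributes.
    fused[:, y:y + h, x:x + w] = (
        (1.0 - attn) * fused[:, y:y + h, x:x + w] + attn * detail_logits
    )
    return fused
```

With attention close to 1, fine detail from the HR crop dominates inside the crop region; with attention close to 0, the long-range context prediction is kept.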

HRDA enables adapting small objects and preserving fine segmentation details. It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA→Cityscapes and by 4.9 mIoU for Synthia→Cityscapes, resulting in an unprecedented performance of 73.8 and 65.8 mIoU, respectively.

The more detailed domain-adaptive semantic segmentation of HRDA, compared to the previous state-of-the-art UDA method DAFormer, can also be observed in example predictions from the Cityscapes validation set.

https://user-images.githubusercontent.com/1277888/181128057-27b8039f-a4c9-4f6d-9aa8-9b7f364d8921.mp4

HRDA can be further extended to domain generalization, which lifts the requirement of access to target images during training. Also in domain generalization, HRDA significantly improves the state-of-the-art performance, by +4.2 mIoU.
For more information on HRDA, please check our [ECCV Paper] and the [Extension Paper].
If you find HRDA useful in your research, please consider citing:
```bibtex
@InProceedings{hoyer2022hrda,
  title={{HRDA}: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={372--391},
  year={2022}
}

@Article{hoyer2024domain,
  title={Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)},
  year={2024},
  volume={46},
  number={1},
  pages={220--235},
  doi={10.1109/TPAMI.2023.3320613}
}
```
Comparison with SOTA UDA
HRDA significantly outperforms previous works on several UDA benchmarks. This includes synthetic-to-real adaptation on GTA→Cityscapes and Synthia→Cityscapes as well as clear-to-adverse-weather adaptation on Cityscapes→ACDC and Cityscapes→DarkZurich.
|                     | GTA→CS(val) | Synthia→CS(val) | CS→ACDC(test) | CS→DarkZurich(test) |
|---------------------|-------------|-----------------|---------------|---------------------|
| ADVENT [1]          | 45.5        | 41.2            | 32.7          | 29.7                |
| BDL [2]             | 48.5        | --              | 37.7          | 30.8                |
| FDA [3]             | 50.5        | --              | 45.7          | --                  |
| DACS [4]            | 52.1        | 48.3            | --            | --                  |
| ProDA [5]           | 57.5        | 55.5            | --            | --                  |
| MGCDA [6]           | --          | --              | 48.7          | 42.5                |
| DANNet [7]          | --          | --              | 50.0          | 45.2                |
| DAFormer (Ours) [8] | 68.3        | 60.9            | 55.4*         | 53.8*               |
| HRDA (Ours)         | 73.8        | 65.8            | 68.0*         | 55.9*               |
* New results of our extension paper
References:
1. Vu et al. "ADVENT: Adversarial entropy minimization for domain adaptation in semantic segmentation" in CVPR, 2019.
2. Li et al. "Bidirectional learning for domain adaptation of semantic segmentation" in CVPR, 2019.
3. Yang et al. "FDA: Fourier domain adaptation for semantic segmentation" in CVPR, 2020.
4. Tranheden et al. "DACS: Domain adaptation via cross-domain mixed sampling" in WACV, 2021.
5. Zhang et al. "Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation" in CVPR, 2021.
6. Sakaridis et al. "Map-guided curriculum domain adaptation and uncertainty-aware evaluation for semantic nighttime image segmentation" in TPAMI, 2020.
7. Wu et al. "DANNet: A one-stage domain adaptation network for unsupervised nighttime semantic segmentation" in CVPR, 2021.
8. Hoyer et al. "DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation" in CVPR, 2022.
Comparison with SOTA Domain Generalization (DG)
HRDA and DAFormer significantly outperform previous works on domain generalization from GTA to real street scenes.
| DG Method           | Cityscapes | BDD100K | Mapillary | Avg.   |
|---------------------|------------|---------|-----------|--------|
| IBN-Net [1,5]       | 37.37      | 34.21   | 36.81     | 36.13  |
| DRPC [2]            | 42.53      | 38.72   | 38.05     | 39.77  |
| ISW [3,5]           | 37.20      | 33.36   | 35.57     | 35.38  |
| SAN-SAW [4]         | 45.33      | 41.18   | 40.77     | 42.43  |
| SHADE [5]           | 46.66      | 43.66   | 45.50     | 45.27  |
| DAFormer (Ours) [6] | 52.65*     | 47.89*  | 54.66*    | 51.73* |
| HRDA (Ours)         | 57.41*     | 49.11*  | 61.16*    | 55.90* |
* New results of our extension paper
References:
1. Pan et al. "Two at once: Enhancing learning and generalization capacities via IBN-Net" in ECCV, 2018.
2. Yue et al. "Domain randomization and pyramid consistency: Simulation-to-real generalization without accessing target domain data" in ICCV, 2019.
3. Choi et al. "RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening" in CVPR, 2021.
4. Peng et al. "Semantic-aware domain generalized segmentation" in CVPR, 2022.
5. Zhao et al. "Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation" in ECCV, 2022.
6. Hoyer et al. "DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation" in CVPR, 2022.
Setup Environment
For this project, we used Python 3.8.5. We recommend setting up a new virtual environment:

```shell
python -m venv ~/venv/hrda
source ~/venv/hrda/bin/activate
```

In that environment, the requirements can be installed with:

```shell
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7  # requires the other packages to be installed first
```
Please download the MiT-B5 ImageNet weights provided by SegFormer from their OneDrive and put them in the folder `pretrained/`.

Further, download the checkpoint of HRDA on GTA→Cityscapes and extract it to the folder `work_dirs/`.
Setup Datasets
**Cityscapes:** Please download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to `data/cityscapes`.

**GTA:** Please download all image and label packages from here and extract them to `data/gta`.
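After extraction, the data directory might look roughly like this. This layout is an assumption based on the standard Cityscapes and GTA download packages; the exact subfolder names expected by the dataset configs may differ, so check the configs in the repository.

```
data/
├── cityscapes/
│   ├── leftImg8bit/
│   └── gtFine/
└── gta/
    ├── images/
    └── labels/
```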
**Synthia (O
