U2PL

[CVPR'22 & IJCV'24] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels & Using Unreliable Pseudo-Labels for Label-Efficient Semantic Segmentation

Using Unreliable Pseudo Labels

Official PyTorch implementation of Semi-Supervised Semantic Segmentation Using Unreliable Pseudo Labels, CVPR 2022.

:bell::bell::bell: The extension of this paper (U2PL+) has been accepted by IJCV! :bell::bell::bell:

Please refer to our project page for qualitative results.

Abstract. The crux of semi-supervised semantic segmentation is to assign adequate pseudo-labels to the pixels of unlabeled images. A common practice is to select the highly confident predictions as the pseudo ground-truth, but this leaves most pixels unused because their predictions are unreliable. We argue that every pixel matters to model training, even if its prediction is ambiguous. Intuitively, an unreliable prediction may be confused among the top classes (i.e., those with the highest probabilities); however, it should be confident that the pixel does not belong to the remaining classes. Hence, such a pixel can be convincingly treated as a negative sample for those most unlikely categories. Based on this insight, we develop an effective pipeline to make sufficient use of unlabeled data. Concretely, we separate reliable and unreliable pixels via the entropy of predictions, push each unreliable pixel to a category-wise queue that consists of negative samples, and manage to train the model with all candidate pixels. Considering the training evolution, where the prediction becomes more and more accurate, we adaptively adjust the threshold for the reliable-unreliable partition. Experimental results on various benchmarks and training settings demonstrate the superiority of our approach over the state-of-the-art alternatives.
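The reliable/unreliable partition described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names, the linear decay of the unreliable fraction, the default `alpha_0=0.2`, and `top_k=2` are all hypothetical choices made for this example.

```python
import math


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def unreliable_fraction(epoch, total_epochs, alpha_0=0.2):
    """Assumed linear schedule: the share of pixels treated as
    unreliable shrinks from alpha_0 to 0 as training proceeds."""
    return alpha_0 * (1.0 - epoch / total_epochs)


def split_pixels(prob_map, epoch, total_epochs, alpha_0=0.2):
    """Split a flat list of per-pixel probability vectors into
    (reliable, unreliable) lists of pixel indices by entropy rank."""
    entropies = [pixel_entropy(p) for p in prob_map]
    fraction = unreliable_fraction(epoch, total_epochs, alpha_0)
    n_unreliable = round(fraction * len(prob_map))
    # Pixel indices ordered from most to least uncertain.
    order = sorted(range(len(prob_map)),
                   key=lambda i: entropies[i], reverse=True)
    return sorted(order[n_unreliable:]), sorted(order[:n_unreliable])


def negative_classes(probs, top_k=2):
    """Classes outside the top_k most probable ones; an unreliable
    pixel can act as a negative sample for these classes."""
    ranked = sorted(range(len(probs)),
                    key=lambda c: probs[c], reverse=True)
    return sorted(ranked[top_k:])
```

In this sketch the threshold is expressed as a fraction of pixels rather than a fixed entropy value, so the partition adapts to how confident the model currently is.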

Results

PASCAL VOC 2012

Labeled images are selected from the train set of the original VOC (1,464 images in total), and the remaining 9,118 images are all treated as unlabeled.

For instance, 1/2 (732) means that 732 images are labeled and the remaining 9,850 (9,118 + 732) are unlabeled.

| Method           | 1/16 (92) | 1/8 (183) | 1/4 (366) | 1/2 (732) | Full (1464) |
| ---------------- | --------- | --------- | --------- | --------- | ----------- |
| SupOnly          | 45.77     | 54.92     | 65.88     | 71.69     | 72.50       |
| U2PL (w/ CutMix) | 67.98     | 69.15     | 73.66     | 76.16     | 79.49       |
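The split sizes for the original-VOC setting above can be reproduced with a short calculation. The sketch below assumes that labeled counts are rounded up (which matches the column headers, e.g. 1464 / 16 = 91.5 gives 92) and that every image not labeled counts as unlabeled; `classic_split` is a hypothetical helper, not a function from the repository.

```python
import math

TOTAL_IMAGES = 10582    # augmented VOC train set, as stated in this README
CLASSIC_LABELED = 1464  # train set of the original VOC


def classic_split(denominator):
    """Return (labeled, unlabeled) image counts for a 1/denominator split.

    Rounding up is an assumption inferred from the table headers.
    """
    labeled = math.ceil(CLASSIC_LABELED / denominator)
    unlabeled = TOTAL_IMAGES - labeled
    return labeled, unlabeled


if __name__ == "__main__":
    for denominator in (16, 8, 4, 2, 1):
        labeled, unlabeled = classic_split(denominator)
        print(f"1/{denominator}: {labeled} labeled, {unlabeled} unlabeled")
```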

Labeled images are selected from the train set of augmented VOC, 10,582 images in total.

The following results were all obtained with models trained on our own splits. We recommend training on several different splits to measure the performance of a method; U2PL can also be trained on the splits provided by CPS or ST++.

| Method           | 1/16 (662) | 1/8 (1323) | 1/4 (2646) | 1/2 (5291) |
| ---------------- | ---------- | ---------- | ---------- | ---------- |
| SupOnly          | 67.87      | 71.55      | 75.80      | 77.13      |
| U2PL (w/ CutMix) | 77.21      | 79.01      | 79.30      | 80.50      |

Cityscapes

Labeled images are selected from the train set, 2,975 images in total.

| Method           | 1/16 (186) | 1/8 (372) | 1/4 (744) | 1/2 (1488) |
| ---------------- | ---------- | --------- | --------- | ---------- |
| SupOnly          | 65.74      | 72.53     | 74.43     | 77.83      |
| U2PL (w/ CutMix) | 70.30      | 74.37     | 76.47     | 79.05      |
| U2PL (w/ AEL)    | 74.90      | 76.48     | 78.51     | 79.12      |
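To read the Cityscapes table as gains over the supervised baseline, the per-split differences can be computed directly from the numbers above. This is only arithmetic on the reported scores; `gains_over_baseline` is a hypothetical helper written for this example.

```python
SPLITS = ["1/16 (186)", "1/8 (372)", "1/4 (744)", "1/2 (1488)"]

# Scores copied from the Cityscapes table above.
CITYSCAPES = {
    "SupOnly": [65.74, 72.53, 74.43, 77.83],
    "U2PL (w/ CutMix)": [70.30, 74.37, 76.47, 79.05],
    "U2PL (w/ AEL)": [74.90, 76.48, 78.51, 79.12],
}


def gains_over_baseline(table, baseline="SupOnly"):
    """Per-split score difference of each method relative to the baseline."""
    base = table[baseline]
    return {
        method: [round(score - ref, 2) for score, ref in zip(scores, base)]
        for method, scores in table.items()
        if method != baseline
    }


if __name__ == "__main__":
    for method, gains in gains_over_baseline(CITYSCAPES).items():
        for split, gain in zip(SPLITS, gains):
            print(f"{method} @ {split}: {gain:+.2f}")
```

For both U2PL variants, the largest gain in this table is at the 1/16 split, where the fewest labeled images are available.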

Checkpoints

Models on PASCAL VOC 2012 (ResNet101-DeepLabv3+) can be found here.
Models on Cityscapes with AEL (ResNet101-DeepLabv3+)

| 1/16 (186)                   | 1/8 (372)                    | 1/4 (744)                    | 1/2 (1488)                   |
| ---------------------------- | ---------------------------- | ---------------------------- | ---------------------------- |
| Google Drive                 | Google Drive                 | Google Drive                 | Google Drive                 |
| Baidu Drive Fetch Code: rrpd | Baidu Drive Fetch Code: welw | Baidu Drive Fetch Code: qwcd | Baidu Drive Fetch Code: 4p8r |

Installation

git clone https://github.com/Haochen-Wang409/U2PL.git && cd U2PL
conda create -n u2pl python=3.6.9
conda activate u2pl
pip install -r requirements.txt
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html

Usage

U2PL is evaluated on both the Cityscapes and PASCAL VOC 2012 datasets.

Prepare Data

<details> <summary>For Cityscapes</summary>

Download "leftImg8bit_trainvaltest.zip" from: https://www.cityscapes-dataset.com/downloads/

Download "gtFine.zip" from: https://drive.google.com/file/d/10tdElaTscdhojER_Lf7XlytiyAkk7Wlg/view?usp=sharing

Next, unzip the files into the data folder and arrange the directory structure as follows:

data/cityscapes
├── gtFine
│   ├── test
│   ├── train
│   └── val
└── leftImg8bit
    ├── test
    ├── train
    └── val

</details> <details> <summary>For PASCAL VOC 2012</summary>

Refer to this link and download PASCAL VOC 2012 augmented with SBD dataset.

Then unzip the files into the data folder and arrange the directory structure as follows:

data/VOC2012
├── Annotations
├── ImageSets
├── JPEGImages
├── SegmentationClass
├── SegmentationClassAug
└── SegmentationObject

</details>

Finally, the data directory should be structured as follows:

data
├── cityscapes
│   ├── gtFine
│   └── leftImg8bit
├── splits
│   ├── cityscapes
│   └── pascal
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
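Before launching a run, it can help to confirm that the layout above is in place. The following is a minimal sketch, assuming the data folder sits in the current working directory; `missing_dirs` and `EXPECTED_DIRS` are hypothetical names introduced here, and the check covers only the directories listed in the tree, not their contents.

```python
from pathlib import Path

# Sub-directories taken from the tree shown above.
EXPECTED_DIRS = [
    "cityscapes/gtFine",
    "cityscapes/leftImg8bit",
    "splits/cityscapes",
    "splits/pascal",
    "VOC2012/Annotations",
    "VOC2012/ImageSets",
    "VOC2012/JPEGImages",
    "VOC2012/SegmentationClass",
    "VOC2012/SegmentationClassAug",
    "VOC2012/SegmentationObject",
]


def missing_dirs(data_root="data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print("  data/" + rel)
    else:
        print("Data layout looks complete.")
```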

Prepare Pretrained Backbone

Before training, please download ResNet101 pretrained on ImageNet-1K from one of the following:

Google Drive
Baidu Drive Fetch Code: 3p9h

After that, edit model_urls in semseg/models/resnet.py so that it points to </path/to/resnet101.pth>.
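As an illustration of this step, the sketch below assumes that `model_urls` is a dictionary mapping an architecture name to a checkpoint location, as in torchvision-style ResNet files; the actual contents of the repository file may differ. The `"resnet101"` key and the `checkpoint_ready` helper are assumptions made for this example, and the path is a placeholder to replace with your own download location.

```python
import os

# Assumed shape of the mapping: architecture name -> local checkpoint path.
model_urls = {
    "resnet101": "/path/to/resnet101.pth",  # placeholder, replace with your path
}


def checkpoint_ready(name, urls=None):
    """Return True if the path configured for `name` is an existing file."""
    urls = model_urls if urls is None else urls
    path = urls.get(name)
    return path is not None and os.path.isfile(path)


if __name__ == "__main__":
    if not checkpoint_ready("resnet101"):
        print("Checkpoint not found at", model_urls["resnet101"])
```

Checking the path up front avoids finding out about a missing backbone only after the training job has started.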

Train a Fully-Supervised Model

For instance, we can train a model on PASCAL VOC 2012 using only the 1,464 labeled images for supervision with:

cd experiments/pascal/1464/suponly
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

Or, for Cityscapes, a model supervised by only 744 labeled images can be trained with:

cd experiments/cityscapes/744/suponly
# use torch.distributed.launch
sh train.sh <num_gpu> <port>

# or use slurm
# sh slurm_train.sh <num_gpu> <port> <partition>

After training, evaluate the model with:

sh eval.sh
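The commands above follow the pattern experiments/<dataset>/<number of labeled images>/<method>. The sketch below builds that path and the matching training command; both helpers are hypothetical, and whether a given combination actually exists in the repository is not checked here. The GPU count and port in the usage lines are arbitrary example values.

```python
from pathlib import PurePosixPath


def experiment_dir(dataset, n_labeled, method):
    """Build the experiment folder path following the pattern used above."""
    return PurePosixPath("experiments") / dataset / str(n_labeled) / method


def train_command(num_gpu, port):
    """Argument list for `sh train.sh <num_gpu> <port>`."""
    return ["sh", "train.sh", str(num_gpu), str(port)]


if __name__ == "__main__":
    # Example values only: 4 GPUs and port 29500 are arbitrary choices.
    print("cd", experiment_dir("pascal", 1464, "suponly"))
    print(" ".join(train_command(4, 29500)))
```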

Train a Semi-Supervised Model

We can train a model on PASCAL VOC 2012 with 1,464 labeled images and 9,118 unlabeled images with:

cd experiments/pascal/1464/ours
# use torch.distrib
