# MagNet: Progressive Semantic Segmentation (CVPR 2021)
MagNet is a multi-scale framework that resolves local ambiguity by looking at the image at multiple magnification levels. It has multiple processing stages, each corresponding to a magnification level, and the output of one stage is fed into the next for coarse-to-fine information propagation. Experiments on three high-resolution datasets of urban views, aerial scenes, and medical images show that MagNet consistently outperforms state-of-the-art methods by a significant margin.
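The coarse-to-fine propagation can be illustrated with a minimal sketch (not the actual implementation; `segment` and `refine` are hypothetical stand-ins for the backbone and refinement modules):

```python
def progressive_segment(image_pyramid, segment, refine):
    """Sketch of MagNet's multi-stage processing.

    image_pyramid: views of the image from coarsest to finest scale.
    segment: maps an image view to an initial prediction.
    refine: fuses the previous stage's prediction with the
            current-scale prediction (coarse-to-fine propagation).
    """
    prediction = segment(image_pyramid[0])        # coarsest stage
    for view in image_pyramid[1:]:                # progressively finer stages
        current = segment(view)
        prediction = refine(prediction, current)  # feed output into next stage
    return prediction
```

Each stage only needs the previous stage's output, which is what lets MagNet process very high-resolution images one magnification level at a time.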

Details of the MagNet model architecture and experimental results can be found in our paper:

```bibtex
@inproceedings{m_Huynh-etal-CVPR21,
  author    = {Chuong Huynh and Anh Tran and Khoa Luu and Minh Hoai},
  title     = {Progressive Semantic Segmentation},
  booktitle = {Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}
```
Please CITE our paper when MagNet is used to help produce published results or incorporated into other software.
## Datasets

The current code provides configurations to train and evaluate on two datasets: Cityscapes and DeepGlobe. To prepare the datasets, follow these steps in the ./data directory:

### For Cityscapes

- Register an account on this page and log in.
- Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip.
- Run the script below to extract the zip files to the correct locations:

```shell
sh ./prepare_cityscapes.sh
```
### For DeepGlobe

- Register an account on this page and log in.
- Go to this page and download the Starting Kit of the #1 Development Phase.
- Run the script below to extract the zip files to the correct locations:

```shell
sh ./prepare_deepglobe.sh
```
If you want to train/evaluate with your own dataset, follow the steps in this document.
## Getting started

### Requirements

The framework is tested on machines with the following environment:

- Python >= 3.6
- CUDA >= 10.0

To install dependencies, run:

```shell
pip install -r requirements.txt
```
### Pretrained models

Performance of pre-trained models on the datasets:

| Dataset | Backbone | Baseline IoU (%) | MagNet IoU (%) | MagNet-Fast IoU (%) | Download |
| ------- | -------- | ---------------- | -------------- | ------------------- | -------- |
| Cityscapes | HRNetW18+OCR | 63.24 | 68.20 | 67.37 | backbone<br>refine_512x256<br>refine_1024x512<br>refine_2048x1024 |
| DeepGlobe | Resnet50-FPN | 67.22 | 72.10 | 68.22 | backbone<br>refine |
Please manually download the pre-trained models to ./checkpoints or run the script below:

```shell
cd checkpoints
sh ./download_cityscapes.sh  # for Cityscapes
# or
sh ./download_deepglobe.sh   # for DeepGlobe
```
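Before running the demos, you can sanity-check that the expected checkpoint files are in place. This helper is not part of the repository; the file names are taken from the demo commands below:

```python
from pathlib import Path

def missing_checkpoints(root, names):
    """Return the checkpoint file names that are not present under root."""
    root = Path(root)
    return [name for name in names if not (root / name).is_file()]

# Checkpoints used by the Cityscapes demo commands below.
cityscapes_ckpts = [
    "cityscapes_hrnet.pth",
    "cityscapes_refinement_512.pth",
    "cityscapes_refinement_1024.pth",
    "cityscapes_refinement_2048.pth",
]
# missing_checkpoints("checkpoints", cityscapes_ckpts) lists any file
# that still needs to be downloaded.
```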
## Usage

You can run this Google Colab Notebook to test our pre-trained models with street-view images. Please follow the instructions in the notebook to experience the performance of our network.

If you want to test our framework on your local machine:

- To test with a Cityscapes image, e.g. data/frankfurt_000001_003056_leftImg8bit.png:
- With MagNet refinement:
```shell
python demo.py --dataset cityscapes \
    --image data/frankfurt_000001_003056_leftImg8bit.png \
    --scales 256-128,512-256,1024-512,2048-1024 \
    --crop_size 256 128 \
    --input_size 256 128 \
    --model hrnet18+ocr \
    --pretrained checkpoints/cityscapes_hrnet.pth \
    --pretrained_refinement checkpoints/cityscapes_refinement_512.pth checkpoints/cityscapes_refinement_1024.pth checkpoints/cityscapes_refinement_2048.pth \
    --num_classes 19 \
    --n_points 32768 \
    --n_patches -1 \
    --smooth_kernel 5 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/cityscapes/demo_magnet.sh data/frankfurt_000001_003056_leftImg8bit.png
```
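The --scales flag lists one width-height pair per processing stage, from coarsest to finest. A hypothetical parser (the actual argument handling in demo.py may differ) shows the intended structure:

```python
def parse_scales(scales):
    """Parse a scales string like "256-128,512-256" into a list of
    (width, height) pairs, one per magnification level."""
    pairs = []
    for token in scales.split(","):
        width, height = token.split("-")
        pairs.append((int(width), int(height)))
    return pairs

# parse_scales("256-128,512-256,1024-512,2048-1024")
# -> [(256, 128), (512, 256), (1024, 512), (2048, 1024)]
```

Note that --crop_size and --input_size match the coarsest scale (256 128): every stage processes patches of the same size, only at different magnifications.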
- With MagNet-Fast refinement:

```shell
python demo.py --dataset cityscapes \
    --image data/frankfurt_000001_003056_leftImg8bit.png \
    --scales 256-128,512-256,1024-512,2048-1024 \
    --crop_size 256 128 \
    --input_size 256 128 \
    --model hrnet18+ocr \
    --pretrained checkpoints/cityscapes_hrnet.pth \
    --pretrained_refinement checkpoints/cityscapes_refinement_512.pth checkpoints/cityscapes_refinement_1024.pth checkpoints/cityscapes_refinement_2048.pth \
    --num_classes 19 \
    --n_points 0.9 \
    --n_patches 4 \
    --smooth_kernel 5 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/cityscapes/demo_magnet_fast.sh data/frankfurt_000001_003056_leftImg8bit.png
```
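Note how the two modes differ: MagNet refines every patch (--n_patches -1) with an absolute point budget (--n_points 32768), while MagNet-Fast refines only a few patches (--n_patches 4) with a fractional budget (--n_points 0.9). One plausible way to interpret the mixed n_points values (an assumption for illustration, not confirmed against the code):

```python
def resolve_n_points(n_points, patch_pixels):
    """Interpret n_points as a fraction of the patch's pixels when it is
    below 1, and as an absolute point count otherwise. This mirrors how
    the demo flags mix values like 32768 and 0.9, but is only a guess at
    the convention."""
    if n_points < 1:
        return int(n_points * patch_pixels)
    return int(n_points)

# For a 256x128 patch (32768 pixels):
# resolve_n_points(32768, 256 * 128) -> 32768 (all points, MagNet)
# resolve_n_points(0.9, 256 * 128)   -> 29491 (90% of points, MagNet-Fast)
```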
All results will be stored at test_results/demo/frankfurt_000001_003056_leftImg8bit
- To test with a DeepGlobe image, e.g. data/639004_sat.jpg:
- With MagNet refinement:
```shell
python demo.py --dataset deepglobe \
    --image data/639004_sat.jpg \
    --scales 612-612,1224-1224,2448-2448 \
    --crop_size 612 612 \
    --input_size 508 508 \
    --model fpn \
    --pretrained checkpoints/deepglobe_fpn.pth \
    --pretrained_refinement checkpoints/deepglobe_refinement.pth \
    --num_classes 7 \
    --n_points 0.75 \
    --n_patches -1 \
    --smooth_kernel 11 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/deepglobe/demo_magnet.sh data/639004_sat.jpg
```
- With MagNet-Fast refinement:

```shell
python demo.py --dataset deepglobe \
    --image data/639004_sat.jpg \
    --scales 612-612,1224-1224,2448-2448 \
    --crop_size 612 612 \
    --input_size 508 508 \
    --model fpn \
    --pretrained checkpoints/deepglobe_fpn.pth \
    --pretrained_refinement checkpoints/deepglobe_refinement.pth \
    --num_classes 7 \
    --n_points 0.9 \
    --n_patches 3 \
    --smooth_kernel 11 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/deepglobe/demo_magnet_fast.sh data/639004_sat.jpg
```
All results will be stored at test_results/demo/639004_sat
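With --scales 612-612,1224-1224,2448-2448 and --crop_size 612 612, each scale is tiled into 612x612 crops, so finer scales contain more patches. A back-of-the-envelope count, assuming non-overlapping tiles (an illustration, not the repository's tiling code):

```python
def patches_per_scale(scale_w, scale_h, crop_w, crop_h):
    """Number of non-overlapping crop_w x crop_h tiles covering a
    scale_w x scale_h view (assumes the crop divides the scale exactly,
    as with 612/1224/2448 and a 612x612 crop)."""
    return (scale_w // crop_w) * (scale_h // crop_h)

# patches_per_scale(612, 612, 612, 612)   -> 1
# patches_per_scale(1224, 1224, 612, 612) -> 4
# patches_per_scale(2448, 2448, 612, 612) -> 16
```

This is why --n_patches matters for speed: MagNet-Fast with --n_patches 3 refines only the 3 most promising patches at each scale instead of all 16 at the finest one.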
## Training

### Training backbone networks

We customize the training script from the HRNet repository to train our backbones. First go to the ./backbone directory, then run the following scripts:
#### HRNetW18V2+OCR for Cityscapes

Download the weights pre-trained on ImageNet and put them into the folder ./pretrained_weights.

Train the model:

```shell
# In ./backbone
python train.py --cfg experiments/cityscapes/hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml
```

The training logs are stored at ./log/cityscapes/HRNetW18_OCR.

The backbone checkpoint after training is stored at ./output/cityscapes/hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484/best.pth. This checkpoint is used to train the refinement modules.
#### Resnet50-FPN for DeepGlobe

Train the model:

```shell
# In ./backbone
python train.py --cfg experiments/deepglobe/resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml
```

The training logs are stored at ./log/deepglobe/ResnetFPN.

The backbone checkpoint after training is stored at ./output/deepglobe/resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484/best.pth. This checkpoint is used to train the refinement modules.
### Training refinement modules

Available arguments for training:

```
train.py [-h] --dataset DATASET [--root ROOT] [--datalist DATALIST]
         --scales SCALES --crop_size N [N ...] --input_size N [N ...]
         [--num_workers NUM_WORKERS] --model MODEL
         --num_classes NUM_CLASSES --pretrained PRETRAINED
         [--pretrained_refinement PRETRAINED_REFINEMENT [PRETRAINED_REFINEMENT ...]]
         --batch_size BATCH_SIZE [--log_dir LOG_DIR] --task_name TASK_NAME
         [--lr LR] [--momentum MOMENTUM] [--decay DECAY]
         [--gamma GAMMA] [--milestones N [N ...]] [--epochs EPOCHS]

optional arguments:
  -h, --help           show this help message and exit
  --dataset DATASET    dataset name: cityscapes, deepglobe (default: None)
  --root ROOT          path to images for training and testing (default: )
  --datalist DATALIST  path to .txt
```