# MagNet: Progressive Semantic Segmentation (CVPR 2021)
MagNet is a multi-scale framework that resolves local ambiguity by looking at the image at multiple magnification levels. It has multiple processing stages, each corresponding to a magnification level, and the output of one stage is fed into the next for coarse-to-fine information propagation. Experiments on three high-resolution datasets of urban views, aerial scenes, and medical images show that MagNet consistently outperforms state-of-the-art methods by a significant margin.
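The coarse-to-fine propagation can be illustrated with a minimal sketch (not the actual implementation; `segment` and `refine` are hypothetical stand-ins for the backbone and refinement modules):

```python
def progressive_segment(image_pyramid, segment, refine):
    """Sketch of MagNet's multi-stage processing.

    image_pyramid: views of the image from coarsest to finest scale.
    segment: maps an image view to an initial prediction.
    refine: fuses the previous stage's prediction with the
            current-scale prediction (coarse-to-fine propagation).
    """
    prediction = segment(image_pyramid[0])        # coarsest stage
    for view in image_pyramid[1:]:                # progressively finer stages
        current = segment(view)
        prediction = refine(prediction, current)  # feed output into next stage
    return prediction
```

Each stage only needs the previous stage's output, which is what lets MagNet process very high-resolution images one magnification level at a time.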

Details of the MagNet model architecture and experimental results can be found in our paper:

```bibtex
@inproceedings{m_Huynh-etal-CVPR21,
  author    = {Chuong Huynh and Anh Tran and Khoa Luu and Minh Hoai},
  title     = {Progressive Semantic Segmentation},
  booktitle = {Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}
```
Please CITE our paper when MagNet is used to help produce published results or incorporated into other software.
## Datasets

The current code provides configurations to train and evaluate on two datasets: Cityscapes and DeepGlobe. To prepare the datasets, follow these steps in the ./data directory:

### For Cityscapes

- Register an account on this page and log in.
- Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip.
- Run the script below to extract the zip files to the correct locations:

```shell
sh ./prepare_cityscapes.sh
```
### For DeepGlobe

- Register an account on this page and log in.
- Go to this page and download the Starting Kit of the #1 Development Phase.
- Run the script below to extract the zip files to the correct locations:

```shell
sh ./prepare_deepglobe.sh
```
If you want to train/evaluate with your own dataset, follow the steps in this document.
## Getting started

### Requirements

The framework is tested on machines with the following environment:

- Python >= 3.6
- CUDA >= 10.0

To install dependencies, run:

```shell
pip install -r requirements.txt
```
### Pretrained models

Performance of pre-trained models on the datasets:

| Dataset | Backbone | Baseline IoU (%) | MagNet IoU (%) | MagNet-Fast IoU (%) | Download |
| ------- | -------- | ---------------- | -------------- | ------------------- | -------- |
| Cityscapes | HRNetW18+OCR | 63.24 | 68.20 | 67.37 | backbone<br>refine_512x256<br>refine_1024x512<br>refine_2048x1024 |
| DeepGlobe | Resnet50-FPN | 67.22 | 72.10 | 68.22 | backbone<br>refine |
Please manually download the pre-trained models to ./checkpoints or run the script below:

```shell
cd checkpoints
sh ./download_cityscapes.sh  # for Cityscapes
# or
sh ./download_deepglobe.sh   # for DeepGlobe
```
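Before running the demos, you can sanity-check that the expected checkpoint files are in place. This helper is not part of the repository; the file names are taken from the demo commands below:

```python
from pathlib import Path

def missing_checkpoints(root, names):
    """Return the checkpoint file names that are not present under root."""
    root = Path(root)
    return [name for name in names if not (root / name).is_file()]

# Checkpoints used by the Cityscapes demo commands below.
cityscapes_ckpts = [
    "cityscapes_hrnet.pth",
    "cityscapes_refinement_512.pth",
    "cityscapes_refinement_1024.pth",
    "cityscapes_refinement_2048.pth",
]
# missing_checkpoints("checkpoints", cityscapes_ckpts) lists any file
# that still needs to be downloaded.
```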
## Usage

You can run this Google Colab Notebook to test our pre-trained models with street-view images. Please follow the instructions in the notebook to experience the performance of our network.

If you want to test our framework on your local machine:

- To test with a Cityscapes image, e.g. data/frankfurt_000001_003056_leftImg8bit.png:
- With MagNet refinement:
```shell
python demo.py --dataset cityscapes \
    --image data/frankfurt_000001_003056_leftImg8bit.png \
    --scales 256-128,512-256,1024-512,2048-1024 \
    --crop_size 256 128 \
    --input_size 256 128 \
    --model hrnet18+ocr \
    --pretrained checkpoints/cityscapes_hrnet.pth \
    --pretrained_refinement checkpoints/cityscapes_refinement_512.pth checkpoints/cityscapes_refinement_1024.pth checkpoints/cityscapes_refinement_2048.pth \
    --num_classes 19 \
    --n_points 32768 \
    --n_patches -1 \
    --smooth_kernel 5 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/cityscapes/demo_magnet.sh data/frankfurt_000001_003056_leftImg8bit.png
```
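The --scales flag lists one width-height pair per processing stage, from coarsest to finest. A hypothetical parser (the actual argument handling in demo.py may differ) shows the intended structure:

```python
def parse_scales(scales):
    """Parse a scales string like "256-128,512-256" into a list of
    (width, height) pairs, one per magnification level."""
    pairs = []
    for token in scales.split(","):
        width, height = token.split("-")
        pairs.append((int(width), int(height)))
    return pairs

# parse_scales("256-128,512-256,1024-512,2048-1024")
# -> [(256, 128), (512, 256), (1024, 512), (2048, 1024)]
```

Note that --crop_size and --input_size match the coarsest scale (256 128): every stage processes patches of the same size, only at different magnifications.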
- With MagNet-Fast refinement:

```shell
python demo.py --dataset cityscapes \
    --image data/frankfurt_000001_003056_leftImg8bit.png \
    --scales 256-128,512-256,1024-512,2048-1024 \
    --crop_size 256 128 \
    --input_size 256 128 \
    --model hrnet18+ocr \
    --pretrained checkpoints/cityscapes_hrnet.pth \
    --pretrained_refinement checkpoints/cityscapes_refinement_512.pth checkpoints/cityscapes_refinement_1024.pth checkpoints/cityscapes_refinement_2048.pth \
    --num_classes 19 \
    --n_points 0.9 \
    --n_patches 4 \
    --smooth_kernel 5 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/cityscapes/demo_magnet_fast.sh data/frankfurt_000001_003056_leftImg8bit.png
```
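Note how the two modes differ: MagNet refines every patch (--n_patches -1) with an absolute point budget (--n_points 32768), while MagNet-Fast refines only a few patches (--n_patches 4) with a fractional budget (--n_points 0.9). One plausible way to interpret the mixed n_points values (an assumption for illustration, not confirmed against the code):

```python
def resolve_n_points(n_points, patch_pixels):
    """Interpret n_points as a fraction of the patch's pixels when it is
    below 1, and as an absolute point count otherwise. This mirrors how
    the demo flags mix values like 32768 and 0.9, but is only a guess at
    the convention."""
    if n_points < 1:
        return int(n_points * patch_pixels)
    return int(n_points)

# For a 256x128 patch (32768 pixels):
# resolve_n_points(32768, 256 * 128) -> 32768 (all points, MagNet)
# resolve_n_points(0.9, 256 * 128)   -> 29491 (90% of points, MagNet-Fast)
```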
All results will be stored at test_results/demo/frankfurt_000001_003056_leftImg8bit
- To test with a DeepGlobe image, e.g. data/639004_sat.jpg:
- With MagNet refinement:
```shell
python demo.py --dataset deepglobe \
    --image data/639004_sat.jpg \
    --scales 612-612,1224-1224,2448-2448 \
    --crop_size 612 612 \
    --input_size 508 508 \
    --model fpn \
    --pretrained checkpoints/deepglobe_fpn.pth \
    --pretrained_refinement checkpoints/deepglobe_refinement.pth \
    --num_classes 7 \
    --n_points 0.75 \
    --n_patches -1 \
    --smooth_kernel 11 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/deepglobe/demo_magnet.sh data/639004_sat.jpg
```
- With MagNet-Fast refinement:

```shell
python demo.py --dataset deepglobe \
    --image data/639004_sat.jpg \
    --scales 612-612,1224-1224,2448-2448 \
    --crop_size 612 612 \
    --input_size 508 508 \
    --model fpn \
    --pretrained checkpoints/deepglobe_fpn.pth \
    --pretrained_refinement checkpoints/deepglobe_refinement.pth \
    --num_classes 7 \
    --n_points 0.9 \
    --n_patches 3 \
    --smooth_kernel 11 \
    --save_pred \
    --save_dir test_results/demo

# or, in short, you can run
sh scripts/deepglobe/demo_magnet_fast.sh data/639004_sat.jpg
```
All results will be stored at test_results/demo/639004_sat
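With --scales 612-612,1224-1224,2448-2448 and --crop_size 612 612, each scale is tiled into 612x612 crops, so finer scales contain more patches. A back-of-the-envelope count, assuming non-overlapping tiles (an illustration, not the repository's tiling code):

```python
def patches_per_scale(scale_w, scale_h, crop_w, crop_h):
    """Number of non-overlapping crop_w x crop_h tiles covering a
    scale_w x scale_h view (assumes the crop divides the scale exactly,
    as with 612/1224/2448 and a 612x612 crop)."""
    return (scale_w // crop_w) * (scale_h // crop_h)

# patches_per_scale(612, 612, 612, 612)   -> 1
# patches_per_scale(1224, 1224, 612, 612) -> 4
# patches_per_scale(2448, 2448, 612, 612) -> 16
```

This is why --n_patches matters for speed: MagNet-Fast with --n_patches 3 refines only the 3 most promising patches at each scale instead of all 16 at the finest one.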
## Training

### Training backbone networks

We customize the training script from the HRNet repository to train our backbones. First go to the ./backbone directory, then run the following scripts:
#### HRNetW18V2+OCR for Cityscapes

Download the weights pre-trained on ImageNet and put them into the folder ./pretrained_weights.

Train the model:

```shell
# In ./backbone
python train.py --cfg experiments/cityscapes/hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml
```

The training logs are stored at ./log/cityscapes/HRNetW18_OCR.

The backbone checkpoint after training is stored at ./output/cityscapes/hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484/best.pth. This checkpoint is used to train the refinement modules.
#### Resnet50-FPN for DeepGlobe

Train the model:

```shell
# In ./backbone
python train.py --cfg experiments/deepglobe/resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml
```

The training logs are stored at ./log/deepglobe/ResnetFPN.

The backbone checkpoint after training is stored at ./output/deepglobe/resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484/best.pth. This checkpoint is used to train the refinement modules.
### Training refinement modules

Available arguments for training:

```
train.py [-h] --dataset DATASET [--root ROOT] [--datalist DATALIST]
         --scales SCALES --crop_size N [N ...] --input_size N [N ...]
         [--num_workers NUM_WORKERS] --model MODEL
         --num_classes NUM_CLASSES --pretrained PRETRAINED
         [--pretrained_refinement PRETRAINED_REFINEMENT [PRETRAINED_REFINEMENT ...]]
         --batch_size BATCH_SIZE [--log_dir LOG_DIR] --task_name TASK_NAME
         [--lr LR] [--momentum MOMENTUM] [--decay DECAY]
         [--gamma GAMMA] [--milestones N [N ...]] [--epochs EPOCHS]

optional arguments:
  -h, --help           show this help message and exit
  --dataset DATASET    dataset name: cityscapes, deepglobe (default: None)
  --root ROOT          path to images for training and testing (default: )
  --datalist DATALIST  path to .txt
```